Preface to the English
edition 


This book is a translation from the French Arithmétique originally pub- 
lished by Calvage & Mounet. Apart from minor corrections and a couple 
of examples added in Chap. 3, I have left the book unchanged. I wish to 
thank all the people from Springer for showing interest and making this 
new version possible. The book is already dedicated to my parents but 
I cannot avoid thinking my father would have been very happy to see me 
publish this book in English. Finally, my heartiest thanks go to Sarah Carr,
who showed both enthusiasm and expertise in translating the text; she even
made me forgive her American spelling.


Preface to the French 
edition 


Amis lecteurs qui ce livre lisez, 

Despouillez vous de toute affection, 

Et, le lisant ne vous scandalisez. 

Il ne contient mal ny infection. 

Vray est qu’icy peu de perfection 

Vous apprendrez, si non en cas de rire : 
Aultre argument ne peut mon cueur eslire. 
Voyant le dueil qui vous mine et consomme, 
Mieulx est de ris que de larmes escripre 
Pour ce que rire est le propre de l’homme.


FRANÇOIS RABELAIS (GARGANTUA)


Arithmetic is certainly the oldest mathematical activity. The use of the 
concept of a whole number, numeral systems and the operations of ad- 
dition, multiplication and division can be found in all civilizations. The 
invention of zero appears to have come from India. Traces of arithmetical 
operations have been identified on bones dating back to the Paleolithic Era, 
on Mesopotamian clay tablets, on Chinese turtle shells and on Egyptian 
papyrus; the Incas, who did not—so it seems—have writing, did develop 
an evolved numeral system based on knots in strings, called quipus. 


In our times, number theory is a branch of mathematics which draws its 
vitality from its rich history. We cite Pythagoras, Euclid, Diophantus, 
Fermat, Euler, Lagrange, Legendre, Gauss, Abel, Jacobi, Dirichlet, Ga- 
lois, Riemann, Hilbert, stopping here at the XIXth century. It is also 
traditionally nourished through interactions with other domains, such as 
algebra, algebraic geometry, topology, complex analysis, harmonic analy- 
sis, etc. More recently, it has made a spectacular appearance in theoretical 
computer science and in questions of communication, cryptography and 
error-correcting codes. 


The notion of a number has actually been progressively extended and enriched
throughout history. All civilizations first considered the whole
numbers (with or without zero). Since the work of Dedekind and Peano 
at the end of the XIXth century, we consider the set of natural numbers,
traditionally denoted by N. Advances in logic, calculation techniques and 
algebra then led us to add negative integers and obtain the set traditionally 
denoted by Z and to introduce fractions (the Greeks spoke of proportions) 
and obtain the set now denoted by Q. Very early on, the necessity of 
considering even more extraordinary numbers, such as π (the proportion
of the circumference of a circle to its diameter) or √2 (the proportion of
the length of the diagonal of a square to one of its sides) appeared, but it
was only very much later that the notions of a real number and a complex 
number were clarified. The set of real numbers is today denoted by R and 
that of complex numbers by C. The latter were called imaginary numbers
for a long time. The two concepts—real and complex numbers—were only 
rigorously defined in the XIXth century. The first rational approximations 
of the number π, computed by Archimedes and others, can be viewed as
the first chapter in the history of Diophantine approximations. We will 
cite one more development, even if it did not become what its inventor— 
Hamilton—had wished it to become, but which has nevertheless proven 
very useful: the quaternions, the set traditionally known as H. 


Some other fundamental objects, known at least since the time of Euclid 
and Pythagoras, are prime numbers, traditionally denoted with p—they 
are so important that in number theory classes, we do not even bother to 
specify that a number p is prime—and polynomial equations (constructed 
using the laws of arithmetic and multiplication) or Diophantine equations. 
“Fermat’s little theorem”, which we write today as $a^p \equiv a \bmod p$, can be
considered to be a turning point in the history of number theory in the 
XVIIth century. The great problem left by Fermat, somewhat accidentally, 
to mathematicians of subsequent centuries also left its mark on history, 
culminating in the solution given by Wiles (1995); it can be stated by 
saying that if n ≥ 3, then there are no non-zero integers x, y, z such that
$x^n + y^n = z^n$. Interest in considering subfields and subrings, such as the
Gaussian integers, $\mathbf{Z}[i]$, or Kummer’s cyclotomic integers, $\mathbf{Z}[\exp(2\pi i/n)]$,
came about little by little, and they were developed as a consequence of 
the theory of algebraic numbers. Modern arithmetic—it could be more 
prudent to say contemporary arithmetic—is also enriched by the study of 
finite quotients such as the congruence rings Z/NZ and the finite fields $\mathbf{F}_q$.
A result dating back to the XIXth century could be considered as a key 
ingredient in these developments: the quadratic reciprocity law. Stated by 
Legendre and proved by Gauss, it says that “if p and q are odd primes and 
p or q is congruent to 1 modulo 4 (resp. p and q congruent to 3 modulo 4),
then p is a square modulo q if and only if q is (resp. is not) a square modulo
p”. In another direction, p-adic numbers, the set traditionally denoted by
$\mathbf{Q}_p$, were invented by Hensel at the end of the XIXth century; we can view
the fields $\mathbf{Q}_p$ as ultrametric completions of the field of rationals Q.
This book is naturally a mix of these various notions of numbers. It offers a
basic number theory course, followed by an initiation to some contemporary 
research areas. It is written at the level of an advanced undergraduate 
course fading into a first-year graduate course, including results which are 
more advanced but which can be appreciated without having to rely on 
“heavy” background knowledge. This book is thus divided into two parts 
which have a gradually different tone. 


a) The first part (Chaps. 1 to 4) corresponds to an advanced undergraduate 
course. All of the statements given in this part are of course accompanied 
by their proofs, with perhaps the exception of some results appearing at 
the end of the chapters. 


b) The second part (Chaps. 5 and 6 and the appendices) is of a higher level 
and is relevant for the first year of graduate school. It contains an intro- 
duction to elliptic curves and a chapter entitled “Developments and Open 
Problems”, which introduces and brings together various themes oriented 
toward ongoing mathematical research. Many of the statements about el- 
liptic curves, often coming from courses given at l’Université Paris 7 and 
the magistère de la rue d’Ulm, are proven, but the panorama proposed in
Chap. 6 contains more statements without proof or which are conjectural 
than proven statements. 


On this note, the first four chapters end with a copious list of exercises 
of varying difficulty; some of them are direct applications of material from 
the book and others, while not necessarily more difficult, require or develop 
some aspect not found in the book. 


Number theory is a multifaceted and flourishing subject. Every author/ 
number theorist is condemned to choose between the many themes of this 
discipline. Our guiding principles in developing the present book were: 


— the wish to give some idea of the very large variety of mathematics 
useful for studying numbers; 


— the “necessity” to look at deep and classical themes, such as Gauss 
sums, Diophantine equations, the distribution of prime numbers and 
the Riemann zeta function; 


— the will to introduce the principle, “arithmetic plays a role in modern 
applied mathematics”. Cryptography and error-correcting codes are 
introduced and used as a motivation for concepts such as cyclotomic 
polynomials and the cyclicity of $(\mathbf{Z}/p^m\mathbf{Z})^*$ and $\mathbf{F}_q^*$;


— the effort to include some recent proofs. The polynomial primality al- 
gorithm (Agrawal-Kayal-Saxena, 2002) is presented, and its correctness 
is proven in detail (moreover, the proof is on the level of an advanced 
undergraduate course!). The proof that we give of the prime number
theorem is essentially due to Newman (1980) and modified by Zagier 
(1997); 


— the desire to approach subjects of contemporary research: elliptic curves, 
rational points on algebraic varieties, “zeta” and “L” functions, etc.; 


— and obviously the incomparable beauty of arithmetic (everybody knows 
that the others are all jealous of it). 


The prerequisites for this text are very modest, at least for the first four 
chapters: undergraduate algebra is assumed (linear algebra, abelian groups, 
rings and divisibility), as well as a little topology of $\mathbf{R}^n$ for Chap. 3. In
addition to elementary real analysis, Chap. 4 is also based on the theory of 
complex analysis (holomorphic functions, power series, the Cauchy formula, 
the residue formula, the complex logarithm) of which we will give a brief 
overview as a reminder. The first four sections on elliptic curves (Chap. 5) 
are relatively elementary, even if the material is a little denser than before, 
and use only simple properties of the projective plane outlined at the be- 
ginning of Appendix B. The last section of Chap. 5 and all of Chap. 6 are 
less accommodating and recall or allude to various more-advanced notions. 


We will now finish with a brief description of the individual chapters, which 
are largely independent of each other. 


The 1st chapter, “Finite Structures”, provides a systematic study of the
congruence groups and rings Z/NZ and finite fields $\mathbf{F}_q$, as well as their
groups of invertible elements (Z/NZ)* and $\mathbf{F}_q^*$. We also confirm the ubiquity
of Gauss sums, studied first in their own right, then used to prove the
quadratic reciprocity law and to count the number of solutions of diagonal 
equations over a finite field. 


The 2nd chapter, “Applications: Algorithms, Primality and Factorization, 
Codes”, begins with the study of the complexity of basic arithmetic oper- 
ations (addition, multiplication, computation of the gcd, inversion modulo 
N, exponentiation, calculations in finite fields). We then briefly introduce 
the RSA system—the star of public key cryptography procedures—which 
governs credit cards, internet transactions, etc. This is the motivation for 
the core of this chapter: the study of algorithms which determine whether 
an integer is prime or composite. The mathematical prerequisites are those 
of Chap. 1, plus an elementary statement coming from analytic number 
theory (which is proven in the first section of Chap. 4). We will also in- 
troduce error-correcting codes—used in compact disc technology and the 
transmission of data—which are another industrial application of number 
theory and serve as a motivation for the study of the decomposition of 
cyclotomic polynomials over a finite field. 
The 3rd chapter, “Algebra and Diophantine Equations”, is an initiation to
the study of some classical problems, such as which numbers are expressible
as the sum of (two, three or four) squares, integer solutions to Pell’s equation
$x^2 - dy^2 = 1$ and integer solutions to Fermat’s equation $x^n + y^n = z^n$
(done here for n = 3 and 4). We then move on to algebraic number the- 
ory: number fields, rings of algebraic integers, decomposition of ideals into 
prime ideals, the group of units and the finiteness of the ideal class group. 
In addition to commutative algebra, the tools used are a little bit of the 
geometry of numbers (lattices, Minkowski’s theorem) and Diophantine ap- 
proximations (Dirichlet’s theorem and continued fractions). 


The 4th chapter, “Analytic Number Theory”, is dedicated to the study of 
the distribution of prime numbers; the two main theorems are the prime 
number theorem: “the number of prime numbers smaller than x is asymptotically
equivalent to x/log x” and the theorem on arithmetic progressions:
“there are infinitely many prime numbers congruent to m modulo n when 
m and n are relatively prime”. Apart from some elementary statements 
(comparison of series and integrals, etc.), the fundamental tool that we use 
is complex analysis. A brief summary of the necessary tools is included 
in the chapter. This chapter also introduces a fundamental mathematical 
object, the “Riemann zeta function”, and closes with an introduction to the 
Riemann hypothesis, which is probably the most important open problem 
in mathematics. 


The 5th chapter, “Elliptic Curves”, is an introduction to the rich theory of 
equations of the type $y^2 = x^3 + ax + b$. We will give you a little bit of
projective geometry and examine the group law on a cubic, the theory of 
heights, notably the Néron-Tate height, before proving the Mordell-Weil 
theorem: “the group of rational solutions of this equation is a finitely gen- 
erated abelian group”. In the following section, we will prove (modulo a 
result proved in Chap. 6) Siegel’s theorem, “the set of solutions where x 
and y are integers is finite”. We will finish by connecting this to the theory 
of elliptic functions and by formulating both the extraordinary theorem 
of Wiles (1995), “Every elliptic curve defined over Q is modular”, and the 
famous Birch & Swinnerton-Dyer conjecture, which relates the rank of the 
group of rational solutions to the behavior at s = 1 of the Dirichlet series 
associated to an elliptic curve. 


The 6th chapter, “Developments and Open Problems”, goes back to some 
of the subjects of the previous chapters, and pushes them to the level of 
current research; in particular, each section contains at least one unsolved 
problem. Of course, some statements must be given without proof, and the 
prerequisites to read this chapter are more advanced, even though we did 
make an effort to give all of the necessary definitions and some essential 
ideas. The six themes that we chose for this chapter are: 
the Weil conjectures, or the computation, which we already started
in Chap. 1, of the number of points on an algebraic variety over a 
finite field. We obtain a precise description of the zeta function of a 
variety over a finite field and, at the same time, a first glimpse of the 
connections between arithmetic, geometry and topology; 


the conjectural dictionary, proposed by Serge Lang, between the qual- 
itative properties of the set of rational points on an algebraic variety 
over a number field and the geometric properties of the variety, as well 
as the properties of the associated analytic complex variety. For al- 
gebraic curves, this dictionary is a theorem and gives us the following 
trichotomy of curves: curves of genus 0 (conics and the projective line), 
curves of genus 1 (elliptic curves) and curves of genus ≥ 2 (the others!);
however, very little is known for varieties of dimensions at least two; 


an introduction to p-adic numbers, with the goal of determining when it
is appropriate to apply the “Hasse principle”, which, in its most elementary
form, asks if an equation $f(x_1, \dots, x_n) = 0$ has an integral solution
whenever it has an integral solution modulo N for every integer N. A
key tool in this context is “Hensel’s lemma”, which we can consider to
be an analogue of Newton’s method for finding real-valued solutions of
equations. We will also sketch the beginnings of the theory of adeles
and ideles: the global additive and multiplicative groups constructed
starting with the local fields $\mathbf{Q}_p$ and R;


a presentation of the fundamental results of Roth (1955) on rational 
approximations of algebraic numbers and Baker (1966) on the tran- 
scendence of linear combinations of logarithms of algebraic numbers. 
We will then give the details of the proof of Thue’s theorem (a prede- 
cessor to and weaker than Roth’s theorem) and the method of applying 
Baker’s theorem to Diophantine equations. This provides an oppor- 
tunity to carefully examine so-called “transcendence” methods and to 
introduce the reader to problems of computational effectiveness; 


the “a, b, c” conjecture, which is a totally elementary statement and
whose proof would have some remarkable consequences, is briefly in- 
troduced. Its connections to elliptic curve theory are also presented. 
This allows us to elaborate on its possible links to the great theorem 
of Wiles; 


zeta functions associated to algebraic varieties (the function associated 
to an elliptic curve is described in detail in Chap. 5) are introduced, 
as well as their connection to the theory of representations of groups. 
Modular forms and Galois representations also make an appearance. 
The end of this section touches the tip of the iceberg of Grothendieck’s 
theory of motives and the Langlands program. 
Appendix A, entitled “Factorization”, follows up on the themes introduced
in Chap. 2, but relies on Chaps. 3 (number fields) and 5 (elliptic curves) 
to describe two recent factorization algorithms for an integer N: Lenstra’s 
algorithm (1986) which uses elliptic curves and an algorithm of Pollard, 
Lenstra et al. (1993) called the “number field sieve”. We will also briefly
discuss the problem of factoring polynomials over a finite field or Z. 


Appendix B on “Elementary Projective Geometry” is an introduction to 
algebraic projective geometry. Some elementary statements on lines, conics 
and cubics are proven and used in Chap. 5 to construct the group law on 
a projective plane cubic. We will also describe Hilbert’s Nullstellensatz in 
detail and prove Bézout’s theorem: two projective plane curves of degrees 
$d_1$ and $d_2$ with no common components intersect at exactly $d_1 d_2$ points
(counted with appropriate multiplicities). 


Appendix C, entitled “Galois Theory”, is an attempt to fill an intentionally- 
made gap. We actually avoided relying on any Galois theory in this text 
(except in the last section of Chap. 6) since it is either absent from the 
classical university curriculum or taught in the first year of graduate school. 
It is however such an important tool in modern number theory, that it 
seemed to us to be necessary to include as a supplement, if only a brief 
one. In particular, we will explain how Chebotarev’s theorem brings together
analytic number theory and Galois groups by generalizing the theorem on 
arithmetic progressions. This appendix, in particular the description of 
the concept of a Galois representation, is a prerequisite for reading the last 
section of Chap. 6. 


The bibliography is composed of two parts: the first one gives nine reference 
books which can be read in parallel with this one, as well as commentaries 
on them; the second part is a more copious list of references to original 
articles and historical and more advanced books. In [28], you can find 
a relatively complete overview of the history of number theory up to the 
beginning of the XXth century. The reference [34] contains numerous open 
but relatively elementary problems. 


Many people were kind enough to take the time to read parts of this book 
and then bring to my attention remarks on the content as well as the 
editing. To that effect, I would first like to thank Dominique Bernardi for 
his careful reading of the entire text. He thus saved the eyes of the happy 
readers from many misprints and more than one mistake. If there are more 
of them, only I can take responsibility for them. Olivier Bordellès, Nicolas
Ratazzi, Marie-France Vignéras and Michel Waldschmidt suggested some
improvements and pointed out some insufficiencies. I owe a large part of
Chap. 2 to numerous discussions with Sinnou David and Jean-François
Mestre. It would have been very difficult for me to complete this text 
without the encouragement and suggestions of Alberto Arabia and Rached
Mneimné. My mathematical and arithmetical education was nourished for
many years by Monday morning lectures by Jean-Pierre Serre at Collège
de France. 


With this, I thank you all heartily. 


Last but not least, this book would not exist without my students, whose 
listening, reactions, moments of silence and questions often motivated and 
reorientated me while I was teaching them in front of a blackboard smeared 
with chalk, the beauties of arithmetic. 
Chapter 1


Finite Structures 


“I hope good luck lies in odd numbers. Away! go. 
They say there is divinity in odd numbers, either in nativity, chance or death. Away!” 


WILLIAM SHAKESPEARE (THE MERRY WIVES OF WINDSOR)


In this chapter, the theory of congruences will lead into the study of the ring 
Z/nZ for n ≥ 2, as well as the group (Z/nZ)* of its invertible elements
with respect to multiplication. Furthermore, for every power of a prime
number, $q = p^f$, there exists a unique finite field, up to isomorphism, of
cardinality q, denoted $\mathbf{F}_q$. We will review the construction of these objects
and state their main properties. In the following sections, we expand on 
some structures and applications, notably Gauss sums, Legendre and Jacobi 
symbols and the number of solutions of congruences. 


1. Review of Z/nZ, (Z/nZ)*, $\mathbf{F}_q$ and $\mathbf{F}_q^*$


The group Z is, up to isomorphism, the only group which is cyclic (gener- 
ated by one element) and infinite. All of its subgroups are of the type mZ, 
for m ≥ 0. The set Z is also equipped with a multiplication which makes
it a commutative ring. In this ring, we have the notions of divisibility and 
of GCD and LCM (greatest common divisor and least common multiple). 
In the case of Z, the notion of an ideal coincides with that of a subgroup. 
From this, we can easily deduce the following theorem. 


1.1. Theorem. (Bézout’s lemma) Let m, n ∈ Z and let d be their GCD.
Then there exist u, v ∈ Z such that


d = um + vn.


Proof. The set H := mZ + nZ = {um + vn | u, v ∈ Z} is clearly a
subgroup, therefore it is of the form d'Z, and there exist u and v such that
d' = um + vn. Since d divides m and n, we see that d divides um + vn = d'.
But m and n are elements of H, so d' divides m and n, and therefore d'
also divides d. It follows then that d = d' (assuming both of them are
positive).
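The proof above is an existence argument. As a complement, here is a minimal sketch of the extended Euclidean algorithm, one standard way (not the method of the proof) to actually compute u and v. The function name `extended_gcd` is chosen here for illustration only.

```python
def extended_gcd(m: int, n: int) -> tuple[int, int, int]:
    """Return (d, u, v) with d = gcd(m, n) >= 0 and d == u*m + v*n."""
    # Invariant: r0 == u0*m + v0*n and r1 == u1*m + v1*n.
    r0, u0, v0 = m, 1, 0
    r1, u1, v1 = n, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:  # normalise so that the GCD is non-negative
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0


d, u, v = extended_gcd(240, 46)
print(d, u, v, u * 240 + v * 46)  # 2 -9 47 2
```

The pair (u, v) is not unique: adding a multiple of n/d to u and subtracting the same multiple of m/d from v gives another valid pair.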


The group Z/nZ is, up to isomorphism, the unique cyclic group with n 
elements, i.e., generated by one element of order n. We will now study the 
generators of this group. 


1.2. Proposition. Let m ∈ Z and let $\bar{m}$ denote its class in Z/nZ. The
following three properties are equivalent.


i) The element $\bar{m}$ is a generator of Z/nZ.
ii) The integers m and n are relatively prime.
iii) The integer m is invertible modulo n, in other words, there exists m' ∈
Z such that mm' ≡ 1 mod n or equivalently $\bar{m}\,\bar{m}' = 1 \in \mathbf{Z}/n\mathbf{Z}$.


Proof. If $\bar{m}$ generates Z/nZ, then there exists m' ∈ Z such that
$m'\bar{m} = 1 \in \mathbf{Z}/n\mathbf{Z}$; hence mm' ≡ 1 mod n, which means that m is invertible modulo
n. If mm' ≡ 1 mod n, then mm' = 1 + an, and therefore m is relatively
prime to n. If m is relatively prime to n, then by Bézout’s lemma, there
exist a and b such that am + bn = 1, hence $a\bar{m} = 1 \in \mathbf{Z}/n\mathbf{Z}$, and therefore
$\bar{m}$ generates Z/nZ.


The group of invertible elements of the ring Z/nZ is therefore equal to 
$(\mathbf{Z}/n\mathbf{Z})^* = \{\bar{m} \in \mathbf{Z}/n\mathbf{Z} \mid m \text{ is relatively prime to } n\}$
$= \{\text{generators of } \mathbf{Z}/n\mathbf{Z}\}.$


1.3. Definition. We denote by φ(n) := card(Z/nZ)* the Euler totient of
the integer n.
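The equivalences of Proposition 1.2 and the definition of φ can be checked by brute force for small n. The following sketch uses hypothetical helper names (`units`, `is_generator`, `totient`) and assumes Python 3.8 or later for the modular inverse `pow(m, -1, n)`.

```python
from math import gcd


def units(n: int) -> list[int]:
    """Representatives in [0, n) of the invertible classes of Z/nZ."""
    return [m for m in range(n) if gcd(m, n) == 1]


def is_generator(m: int, n: int) -> bool:
    """True if the multiples of m exhaust Z/nZ (brute force)."""
    return len({(k * m) % n for k in range(n)}) == n


def totient(n: int) -> int:
    """Euler totient, computed directly from the definition."""
    return len(units(n))


n = 12
print(units(n))                                     # [1, 5, 7, 11]
print([m for m in range(n) if is_generator(m, n)])  # [1, 5, 7, 11]
print([pow(m, -1, n) for m in units(n)])            # [1, 5, 7, 11]
print(totient(n), totient(27))                      # 4 18
```

For n = 12 every unit happens to be its own inverse; this is a feature of this particular modulus, not a general fact.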


By noticing that gcd(m, $p^r$) = 1 if and only if gcd(m, p) = 1, we can easily deduce that if p is
prime, $\phi(p^r) = p^r - p^{r-1} = (p-1)p^{r-1}$. In general, to calculate φ(n), we
make use of the following classical lemma.


1.4. Proposition. (Chinese remainder theorem) Let m, n ∈ Z, and
suppose that m and n are relatively prime. Then the groups Z/mnZ and
Z/mZ × Z/nZ are naturally isomorphic. Furthermore, this isomorphism
is also a ring isomorphism and consequently induces an isomorphism of
(Z/mnZ)* and (Z/mZ)* × (Z/nZ)*. In particular, φ(mn) = φ(m)φ(n).


Proof. Consider the map f : Z → Z/mZ × Z/nZ given by x ↦ (x mod m,
x mod n). It is a group homomorphism with kernel lcm(m, n)Z, hence we
have the injective map

$\hat{f} : \mathbf{Z}/\operatorname{lcm}(m,n)\mathbf{Z} \hookrightarrow \mathbf{Z}/m\mathbf{Z} \times \mathbf{Z}/n\mathbf{Z}.$


Since m and n are relatively prime, lcm(m, n) = mn, and by considering
the cardinalities of the two groups, the homomorphism $\hat{f}$ must be an
isomorphism. For any rings A and B, we have (A × B)* = A* × B*, hence
the second assertion.
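The proof shows that the map is bijective without giving its inverse. A minimal sketch of one way to invert it, using modular inverses (which exist by Bézout’s lemma), is given below; the name `crt_pair` is an assumption made for this illustration, and `pow(n, -1, m)` requires Python 3.8 or later.

```python
from math import gcd


def crt_pair(a: int, m: int, b: int, n: int) -> int:
    """Return the unique x in [0, m*n) with x = a mod m and x = b mod n.

    Assumes gcd(m, n) == 1, as in the proposition.
    """
    if gcd(m, n) != 1:
        raise ValueError("moduli must be relatively prime")
    # n * n_inv = 1 mod m and m * m_inv = 1 mod n (Bezout coefficients).
    n_inv = pow(n, -1, m)
    m_inv = pow(m, -1, n)
    return (a * n * n_inv + b * m * m_inv) % (m * n)


m, n = 4, 9
images = {(x % m, x % n) for x in range(m * n)}
print(len(images) == m * n)   # True: x -> (x mod m, x mod n) is a bijection
print(crt_pair(3, 4, 5, 9))   # 23
# Units correspond to pairs of units, hence phi(mn) = phi(m) * phi(n).
print(all((gcd(x, m * n) == 1) == (gcd(x % m, m) == 1 and gcd(x % n, n) == 1)
          for x in range(m * n)))  # True
```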


1.5. Remark. A function f : N* → C is generally known as an arithmetic
function. We say that an arithmetic function f : N* → C is multiplicative
(resp. completely multiplicative) if f(mn) = f(m)f(n) for all m, n which
are relatively prime (resp. for all m, n). Thus the Euler totient φ is multiplicative
but not completely multiplicative; notice however that φ(mn) is
always greater than or equal to φ(m)φ(n).


The description of the subgroups of Z/nZ is fairly simple. 


1.6. Proposition. For any integer d ≥ 1 which divides n, there exists a
unique subgroup of Z/nZ of order d: namely, the cyclic subgroup generated 
by the class of n/d in Z/nZ. 


Proof. Assume n = dd'. The element $x = \bar{d'} \in \mathbf{Z}/n\mathbf{Z}$ is therefore of order
d since obviously dx = 0, and if cx = 0, then n divides cd', so d divides
c. Now let H be a subgroup of Z/nZ of order d. Let s be the canonical
surjection s : Z → Z/nZ. We know that $s^{-1}(H) = m\mathbf{Z}$ is generated by
m, hence H is generated by $\bar{m} \in \mathbf{Z}/n\mathbf{Z}$. We then have $d\bar{m} = 0$, hence n
divides dm, and therefore d' divides m, so the subgroup H is contained in
the subgroup generated by $\bar{d'}$ and is therefore equal to this subgroup.


An application of this proposition is the following formula (that we will use 
further down): 


$n = \sum_{d \mid n} \phi(d).$ (1.1)


To see why this is true, we write Z/nZ as the disjoint union of sets, where 
each set contains the elements of order d and d divides n. The number of 
elements in each set is the number of generators of the unique subgroup 
of order d, and since the latter is isomorphic to Z/dZ, the number of such 
generators is φ(d).
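The counting argument behind formula (1.1) can be illustrated numerically: sort the elements of Z/nZ by their additive order and compare each count with φ(d). This is only a brute-force sketch; `additive_order` and `totient` are names introduced here for the illustration.

```python
from collections import Counter
from math import gcd


def totient(n: int) -> int:
    """Euler totient by direct count."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def additive_order(x: int, n: int) -> int:
    """Smallest k >= 1 with k*x = 0 in Z/nZ."""
    k = 1
    while (k * x) % n != 0:
        k += 1
    return k


n = 12
by_order = Counter(additive_order(x, n) for x in range(n))
for d in sorted(by_order):
    # columns: d, number of elements of order d, phi(d)
    print(d, by_order[d], totient(d))
print(sum(totient(d) for d in range(1, n + 1) if n % d == 0))  # 12
```

In the printed table the last two columns agree for every divisor d of 12, which is exactly the partition used in the text.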


A finite field k necessarily has finite characteristic equal to a prime number
p and therefore contains $\mathbf{Z}/p\mathbf{Z} = \mathbf{F}_p$ (the homomorphism Z → k has
kernel nZ with n > 0, and since Z/nZ ↪ k, n must be prime). The
dimension of k, viewed as a vector space over $\mathbf{F}_p$, is finite, say f, and
therefore card(k) = $p^f$. We know that card(k*) = $p^f - 1$, so all of the
elements of k* satisfy $x^{p^f - 1} = 1$, and therefore all of the elements of k
satisfy $x^{p^f} = x$. Conversely, we can construct a finite field with cardinality
$p^f$ as follows: we consider an extension K of $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$ in which the polynomial
$P = X^{p^f} - X$ splits completely into $p^f$ linear factors. We then set
$k := \{x \in K \mid P(x) = 0\}$. Since P'(X) = -1, the roots of P are simple and
card(k) = deg(P) = $p^f$; furthermore, k is a subfield of K because in characteristic
p, the “Frobenius map” given by $\phi(x) = x^p$ is a homomorphism
of fields; the same holds true for $\phi^f$. In other words we have:


$(xy)^{p^f} = x^{p^f} y^{p^f}$ and $(x + y)^{p^f} = x^{p^f} + y^{p^f}.$


From general field theory, we know that the field k of order $p^f$ is unique,
up to isomorphism, and is denoted by $\mathbf{F}_{p^f}$. The following statement summarizes
these notions.


1.7. Theorem. Let p be a prime number, f ≥ 1 and $q = p^f$. There exists
a unique finite field, up to isomorphism, of order q. The elements of $\mathbf{F}_q$
are the roots of the polynomial $X^q - X \in \mathbf{Z}/p\mathbf{Z}[X]$.


1.8. Corollary.¹ Let $q = p^f$ and $\mathbf{F}_q$ the field defined above. The subfields
of $\mathbf{F}_q$ are isomorphic to $\mathbf{F}_{p^d}$, where d divides f. Conversely, if d divides f,
there exists a unique subfield of $\mathbf{F}_q$ isomorphic to $\mathbf{F}_{p^d}$: it is exactly the set
of elements which satisfy $x^{p^d} = x$.

Proof. If $\mathbf{F}_p \subset k \subset \mathbf{F}_q$, then card(k) = $p^d$ with $d = [k : \mathbf{F}_p]$, and $k \cong \mathbf{F}_{p^d}$.
Furthermore, $f = [\mathbf{F}_q : \mathbf{F}_p] = [\mathbf{F}_q : k][k : \mathbf{F}_p]$, and therefore d divides f.
Conversely, if d divides f, f = ed, then every element (in an extension of
$\mathbf{F}_p$) which satisfies $x^{p^d} = x$ also satisfies $x^{p^f} = x^{p^{ed}} = x$ and is therefore in
$\mathbf{F}_q$. These elements form a subfield isomorphic to $\mathbf{F}_{p^d}$.


¹ This statement can be reinterpreted in terms of Galois theory (see Appendix C).


In practice, we construct the fields $\mathbf{F}_{p^f}$ as follows: we choose an irreducible,
monic polynomial of degree f, say $P \in \mathbf{F}_p[X]$ (the existence of such a
polynomial is equivalent to the existence of an element $\alpha \in \mathbf{F}_{p^f}$ such that
$\mathbf{F}_{p^f} = \mathbf{F}_p(\alpha)$ and is guaranteed by invoking, for example, Lemma 1-2.1
below) and we represent $\mathbf{F}_{p^f}$ as $\mathbf{F}_p[X]/P\mathbf{F}_p[X]$. An element of $\mathbf{F}_{p^f}$ can be
seen as a polynomial of degree ≤ f - 1 with coefficients in Z/pZ. Addition
is the obvious addition, and the multiplication rule is simply polynomial
multiplication, followed by taking the remainder gotten from the division
algorithm. For example,

$\mathbf{F}_4 = \mathbf{F}_2[X]/(X^2 + X + 1)\mathbf{F}_2[X]$, $\mathbf{F}_8 = \mathbf{F}_2[X]/(X^3 + X + 1)\mathbf{F}_2[X]$,
$\mathbf{F}_{16} = \mathbf{F}_2[X]/(X^4 + X^3 + X^2 + X + 1)\mathbf{F}_2[X]$.
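To make the multiplication rule concrete, here is a small sketch for p = 2. It assumes an encoding that is not in the text: a polynomial over F_2 is stored as an integer whose bit i is the coefficient of X^i, so that addition is XOR. The names `gf2_mulmod`, `gf2_pow` and `MODULI` are hypothetical, and the degree 3 modulus is taken to be X^3 + X + 1.

```python
def gf2_mulmod(a: int, b: int, modulus: int) -> int:
    """Multiply two reduced polynomials over F_2 and reduce modulo `modulus`.

    Polynomials are encoded as integers: bit i is the coefficient of X^i.
    """
    deg = modulus.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a       # addition in F_2[X] is XOR
        b >>= 1
        a <<= 1               # multiply a by X
        if (a >> deg) & 1:
            a ^= modulus      # reduce as soon as the degree reaches deg
    return result


def gf2_pow(a: int, e: int, modulus: int) -> int:
    """Compute a^e in F_2[X]/(modulus) by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf2_mulmod(result, a, modulus)
        a = gf2_mulmod(a, a, modulus)
        e >>= 1
    return result


# X^2+X+1, X^3+X+1 and X^4+X^3+X^2+X+1, keyed by the field cardinality q.
MODULI = {4: 0b111, 8: 0b1011, 16: 0b11111}

for q, P in MODULI.items():
    # Every element is a root of X^q - X; every nonzero element satisfies x^(q-1) = 1.
    assert all(gf2_pow(x, q, P) == x for x in range(q))
    assert all(gf2_pow(x, q - 1, P) == 1 for x in range(1, q))
print(gf2_mulmod(0b10, 0b10, MODULI[4]))  # 3, i.e. X * X = X + 1 in F_4
```

The two assertions are the statements of Theorem 1.7 checked element by element for these three fields.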


2. The Group Structure of (Z/nZ)* and $\mathbf{F}_q^*$


In order to describe the structure of these groups, we start by proving the 
following lemma, which is interesting in and of itself. 


2.1. Lemma. Let k be a field and G a finite subgroup of k*. Then G is
cyclic. In particular, (Z/pZ)* or more generally $\mathbf{F}_q^*$ is cyclic.


Proof. Set n := card(G), and let ψ(d) be the number of elements of order
d in G. It is clear that $n = \sum_{d \mid n} \psi(d)$. Let d be an integer which divides
n: either there are no elements of order d in G in which case ψ(d) = 0, or
there exists one which generates a cyclic subgroup H of order d. All of the
elements of H are solutions to the equation $X^d = 1$, but since k is a field,
such an equation has at most d roots in k; all of the elements of order d
are therefore in H, and there are φ(d) of them because H ≅ Z/dZ. Hence
ψ(d) is either zero or φ(d), but since $n = \sum_{d \mid n} \psi(d) = \sum_{d \mid n} \phi(d)$ (by
(1.1)), we see that ψ(d) = φ(d) for every d which divides n. In particular,
ψ(n) = φ(n) ≥ 1, which implies that G is cyclic.
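The conclusion ψ(d) = φ(d) of the proof can be observed directly in G = (Z/pZ)* for a small prime. The following is a brute-force sketch; the prime p = 31 and the helper names `mult_order` and `totient` are choices made here for illustration.

```python
from collections import Counter
from math import gcd


def mult_order(x: int, p: int) -> int:
    """Order of x in (Z/pZ)*, by repeated multiplication (x not divisible by p)."""
    k, y = 1, x % p
    while y != 1:
        y = (y * x) % p
        k += 1
    return k


def totient(n: int) -> int:
    """Euler totient by direct count."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


p = 31
psi = Counter(mult_order(x, p) for x in range(1, p))
for d in sorted(psi):
    # columns: d, psi(d), phi(d); the last two agree
    print(d, psi[d], totient(d))
generators = [x for x in range(1, p) if mult_order(x, p) == p - 1]
print(len(generators) == totient(p - 1))  # True
```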


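From what we have seen, if $n = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$, then

$(\mathbf{Z}/n\mathbf{Z})^* \cong (\mathbf{Z}/p_1^{\alpha_1}\mathbf{Z})^* \times \cdots \times (\mathbf{Z}/p_s^{\alpha_s}\mathbf{Z})^*,$


and in particular

$\phi(n) = \prod_{i=1}^{s} (p_i^{\alpha_i} - p_i^{\alpha_i - 1}) = n \prod_{i=1}^{s} (1 - 1/p_i).$


We will now describe the structure of the groups $(\mathbf{Z}/p^\alpha\mathbf{Z})^*$.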


2.2. Proposition. Let p be prime and α ≥ 1.
i) If p is odd, then $(\mathbf{Z}/p^\alpha\mathbf{Z})^*$ is cyclic.
ii) If p = 2 and α ≥ 3, then $(\mathbf{Z}/2^\alpha\mathbf{Z})^* \cong \mathbf{Z}/2^{\alpha-2}\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$, which is not
cyclic. However, (Z/2Z)* = {1} and (Z/4Z)* ≅ Z/2Z are cyclic.

Proof. If α = 1, we have seen that $(\mathbf{Z}/p\mathbf{Z})^* = \mathbf{F}_p^*$ is cyclic. When α > 1,
we use the element p + 1.
2.3. Lemma. Let p be an odd prime. The class of p + 1 in $(\mathbf{Z}/p^\alpha\mathbf{Z})^*$ has
order $p^{\alpha-1}$.


Proof. (of Lemma 1-2.3) We first prove the congruence

$(p+1)^{p^k} \equiv 1 + p^{k+1} \bmod p^{k+2}$

by induction. For k = 0, the congruence is trivial. For k = 1, we have
$(p+1)^p \equiv 1 + \binom{p}{1}p + \binom{p}{2}p^2 = 1 + p^2 + p^3(p-1)/2 \bmod p^3$, and the latter
is of course congruent to $1 + p^2$ if p is odd (notice however that $3^2 \not\equiv 1 + 2^2 \bmod 2^3$).
Assume now that k ≥ 1 and $(p+1)^{p^{k-1}} = 1 + p^k + ap^{k+1}$.
Then $(p+1)^{p^k} = (1 + p^k + ap^{k+1})^p \equiv 1 + p(p^k + ap^{k+1}) \equiv 1 + p^{k+1} \bmod p^{k+2}$
since 1 + 2k ≥ k + 2. In particular, we see that $(p+1)^{p^{\alpha-1}} \equiv 1 \bmod p^\alpha$,
but $(p+1)^{p^{\alpha-2}} \equiv 1 + p^{\alpha-1} \not\equiv 1 \bmod p^\alpha$, which implies that p + 1 has order
$p^{\alpha-1}$ in $(\mathbf{Z}/p^\alpha\mathbf{Z})^*$.


We can now finish the proof of the proposition for $p$ odd. Let $x \in \mathbf{Z}$
such that $x$ modulo $p$ generates $(\mathbf{Z}/p\mathbf{Z})^*$, i.e., has order $p - 1$ in $(\mathbf{Z}/p\mathbf{Z})^*$.
Therefore $x$ has order $m(p - 1)$ in $(\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$, and hence $y = x^m$ has order
exactly $p - 1$ in $(\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$. The element $u := y(p + 1)$ therefore has order
$p^{\alpha-1}(p - 1)$ because $p^{\alpha-1}$ and $p - 1$ are relatively prime, which gives us
that $u$ is a generator of $(\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$.
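
The construction in this proof can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, orders are found by brute force, and the input p is assumed to be an odd prime.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Order of the class of a in (Z/nZ)^*, found by repeated multiplication."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("need n >= 2 and a invertible modulo n")
    order, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        order += 1
    return order


def generator_mod_prime_power(p: int, alpha: int) -> int:
    """Return u = y * (p + 1) as in the proof (p is assumed to be an odd prime)."""
    n = p ** alpha
    # x: an integer whose class generates (Z/pZ)^*, found by brute force.
    x = next(g for g in range(2, p) if multiplicative_order(g, p) == p - 1)
    # The order of x modulo p^alpha is m * (p - 1), so y = x^m has order p - 1.
    m = multiplicative_order(x, n) // (p - 1)
    y = pow(x, m, n)
    return (y * (p + 1)) % n


if __name__ == "__main__":
    p, alpha = 5, 3
    n = p ** alpha
    print(multiplicative_order(p + 1, n))  # order of p + 1 modulo p^alpha
    print(multiplicative_order(generator_mod_prime_power(p, alpha), n))  # order of u
```

For small moduli this brute-force search is enough; it only illustrates the statement and is not meant as an efficient way to find primitive roots.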


2.4. Lemma. Let $\alpha \geq 3$. The class of 5 in $(\mathbf{Z}/2^{\alpha}\mathbf{Z})^*$ has order $2^{\alpha-2}$.
Furthermore, the class of $-1$ does not belong to the subgroup generated by
the class of 5.

Proof. (of Lemma 1-2.4) We first show by induction that
$$5^{2^k} \equiv 1 + 2^{k+2} \bmod 2^{k+3}.$$
The congruence is trivial for $k = 0$, and for $k = 1$ we check that $25 = 5^2 \equiv 1 + 2^3 = 9 \bmod 2^4$.
Therefore, we can assume that $5^{2^{k-1}} = 1 + 2^{k+1} + a2^{k+2}$.
Then $5^{2^k} = (1 + 2^{k+1} + a2^{k+2})^2 = 1 + 2(2^{k+1} + a2^{k+2}) + 2^{2(k+1)}(1 + 2a)^2 \equiv 1 + 2^{k+2} \bmod 2^{k+3}$.
In particular, $5^{2^{\alpha-2}} \equiv 1 \bmod 2^{\alpha}$, but $5^{2^{\alpha-3}} \equiv 1 + 2^{\alpha-1} \not\equiv 1 \bmod 2^{\alpha}$,
so 5 has order $2^{\alpha-2}$. For the second assertion, observe that for
every integer $m$, we have $5^m \equiv 1 \not\equiv -1 \bmod 4$.


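For the proof of the second part of the proposition, we can assume that $\alpha \geq 3$
(actually, we see immediately how to calculate $(\mathbf{Z}/2\mathbf{Z})^*$ and $(\mathbf{Z}/4\mathbf{Z})^*$).
The class of 5 therefore generates a subgroup isomorphic to $\mathbf{Z}/2^{\alpha-2}\mathbf{Z}$, and
$-1$ generates a subgroup of order 2 not contained in the former. Therefore,
$(\mathbf{Z}/2^{\alpha}\mathbf{Z})^* = \langle 5 \rangle \oplus \langle -1 \rangle \cong \mathbf{Z}/2^{\alpha-2}\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$.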
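
The decomposition just obtained says that every odd residue modulo $2^{\alpha}$ is uniquely of the form $(-1)^s 5^k$. The sketch below is an illustration only (the function name is hypothetical, and $\alpha \geq 3$ is assumed); it tabulates this decomposition for a small $\alpha$.

```python
def decompose_units_mod_power_of_two(alpha: int) -> dict[int, tuple[int, int]]:
    """Map each odd residue x mod 2^alpha to (s, k) with x = (-1)^s * 5^k (alpha >= 3)."""
    if alpha < 3:
        raise ValueError("the statement is for alpha >= 3")
    n = 2 ** alpha
    table: dict[int, tuple[int, int]] = {}
    for s in (0, 1):
        for k in range(2 ** (alpha - 2)):
            # Python's % always returns a representative in [0, n).
            x = ((-1) ** s * pow(5, k, n)) % n
            table[x] = (s, k)
    return table


if __name__ == "__main__":
    table = decompose_units_mod_power_of_two(5)
    # Every odd residue modulo 32 should appear exactly once as a key.
    print(sorted(table) == list(range(1, 32, 2)))
```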

2.5. Remark. The quaternion subgroup $H_8 = \{\pm 1, \pm i, \pm j, \pm k\}$ is a finite
subgroup of the multiplicative group of the division ring $\mathbf{H}$ but is not cyclic
(which does not contradict Lemma 1-2.1 because $\mathbf{H}$ is not commutative).


Applications. The previous statements allow us to find the number of
solutions to the equation $x^m = 1$ in $\mathbf{F}_q^*$ or $(\mathbf{Z}/N\mathbf{Z})^*$, as well as the number
of $m$th powers. This is true because in a cyclic group of order $n$, say
$G = \mathbf{Z}/n\mathbf{Z}$, the number of elements which satisfy $mx = 0$ is equal to
$d := \gcd(m, n)$: by making use of Bézout's lemma, we can show that
$\{x \in \mathbf{Z}/n\mathbf{Z} \mid mx = 0\}$ is equal to $\{x \in \mathbf{Z}/n\mathbf{Z} \mid dx = 0\}$, and since $d$ divides
$n$, the latter set is the cyclic subgroup of order $d$ in $\mathbf{Z}/n\mathbf{Z}$. By applying
this to $G = \mathbf{F}_q^*$ or $G = (\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$, we get the first part of the following
proposition.


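2.6. Proposition. Let $m$ be an integer $\geq 1$.
1) We have the following formulas:
- $\operatorname{card}\{x \in \mathbf{F}_q^* \mid x^m = 1\} = \gcd(m, q - 1)$;
- $\operatorname{card}\{x \in (\mathbf{Z}/p^{\alpha}\mathbf{Z})^* \mid x^m = 1\} = \gcd(m, (p - 1)p^{\alpha-1})$ (for $p$ odd).
2) More generally, if $N = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$ is odd,
$$\operatorname{card}\{x \in (\mathbf{Z}/N\mathbf{Z})^* \mid x^m = 1\} = \prod_{i=1}^{r} \gcd\left(m, (p_i - 1)p_i^{\alpha_i - 1}\right).$$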


Proof. The formulas in part 1) follow from the previous discussion and
from the fact that $\mathbf{F}_q^*$ and $(\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$ are cyclic. Formula 2) follows from
the previous formula and from the Chinese remainder theorem. This is
because for all $x \in \mathbf{Z}$, $x^m \equiv 1 \bmod N$ is equivalent to $x^m \equiv 1 \bmod p_i^{\alpha_i}$ for
$1 \leq i \leq r$.
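
As an illustration of formula 2), the sketch below compares the product of gcd's with a brute-force count for small odd $N$. The helper names are hypothetical and the factorization uses plain trial division, so this is only suitable for small inputs.

```python
from math import gcd


def factorize(n: int) -> dict[int, int]:
    """Prime factorization of n >= 2 by trial division, as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def count_roots_formula(m: int, N: int) -> int:
    """Number of x in (Z/NZ)^* with x^m = 1, via the product formula (N odd, N >= 3)."""
    if N < 3 or N % 2 == 0:
        raise ValueError("the formula is stated for odd N >= 3")
    count = 1
    for p, a in factorize(N).items():
        count *= gcd(m, (p - 1) * p ** (a - 1))
    return count


def count_roots_brute(m: int, N: int) -> int:
    """Same count, by testing every invertible residue modulo N."""
    return sum(1 for x in range(1, N) if gcd(x, N) == 1 and pow(x, m, N) == 1)


if __name__ == "__main__":
    print(count_roots_formula(4, 45), count_roots_brute(4, 45))
```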


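2.7. Remark. By considering the homomorphism $x \mapsto x^m$, we can easily
see that
$$\operatorname{card} \mathbf{F}_q^{*m} = \operatorname{card}\{x \in \mathbf{F}_q^* \mid \exists y \in \mathbf{F}_q^*,\ x = y^m\} = \frac{q - 1}{\gcd(m, q - 1)}.$$
For example, if $q$ is odd, we have $(\mathbf{F}_q^* : \mathbf{F}_q^{*2}) = 2$.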


3. Jacobi and Legendre Symbols 


In this section, we mainly concentrate on the study of squares, i.e., the case 
m = 2 of the preceding section. 


We begin with a remark. The map $x \mapsto x^2$ is an isomorphism from $\mathbf{F}_2$
to $\mathbf{F}_2$ or more generally from $\mathbf{F}_{2^f}$ to $\mathbf{F}_{2^f}$; in order to study squares, it is
therefore natural to assume $p \neq 2$, and that is what we do.

3.1. Definition. We define the Legendre symbol for $a \in \mathbf{Z}$ and $p \neq 2$ as
follows:
$$\left(\frac{a}{p}\right) := \begin{cases} 0 & \text{if } a \equiv 0 \bmod p, \\ +1 & \text{if } a \text{ is a non-zero square} \bmod p, \\ -1 & \text{if } a \text{ is not a square} \bmod p. \end{cases}$$

3.2. Remark. It is clear that $\left(\frac{a}{p}\right)$ only depends on $a \bmod p$, thus we will
continue to use the same notation whenever $a \in \mathbf{F}_p$. If $\left(\frac{a}{p}\right) = +1$, we say
that $a$ is a quadratic residue; if $\left(\frac{a}{p}\right) = -1$, we say that $a$ is a quadratic
nonresidue.


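3.3. Theorem. The Legendre symbol satisfies the following properties.

i) For any $a, b \in \mathbf{Z}$,
$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right).$$

ii) For every $a \in \mathbf{Z}$,
$$a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \bmod p.$$

iii) For every $p \neq 2$,
$$\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2} \quad\text{and}\quad \left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}.$$
In particular, $-1$ is a square modulo $p$ (resp. not a square) if $p \equiv 1 \bmod 4$
(resp. $p \equiv 3 \bmod 4$), and 2 is a square modulo $p$ (resp. not a
square) if $p \equiv \pm 1 \bmod 8$ (resp. $p \equiv \pm 3 \bmod 8$).

iv) (Quadratic reciprocity law) Let $p$ and $q$ be two distinct odd prime numbers.
Then we have
$$\left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4}\left(\frac{p}{q}\right).$$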

Proof. The multiplicativity in part i) is clear if $p$ divides $a$ or $b$, since then
the two terms are 0. If $a, b \in \mathbf{F}_p^*$, the formula comes from the fact that
$\mathbf{F}_p^*/\mathbf{F}_p^{*2}$ is of order 2, so the product of two quadratic nonresidues is a
quadratic residue.

To prove ii), we observe that since $(a^{(p-1)/2})^2 = a^{p-1} = 1$, we always
have $a^{(p-1)/2} = \pm 1$, and by Proposition 1-2.6, the subgroup $H$ of elements
satisfying $a^{(p-1)/2} = 1$ is of order $(p - 1)/2$. In addition, the set of squares
is a subgroup of order $(p - 1)/2$. Furthermore, if $a = b^2$, we can deduce that
$a^{(p-1)/2} = b^{p-1} = 1$, hence $\mathbf{F}_p^{*2} \subset H$, and we have the desired equality.

The first part of iii) follows from equality ii). For the second part, we
introduce $\alpha$, a root of $X^4 + 1 = 0$; it is an 8th primitive root of unity in
an algebraic extension of $\mathbf{F}_p$, in other words $\alpha^8 = 1$ but $\alpha^4 \neq 1$, which is
equivalent to $\alpha^4 = -1$ and also $\alpha^2 = -\alpha^{-2}$. If we set $\beta := \alpha + \alpha^{-1}$, then
$\beta^2 = \alpha^2 + 2 + \alpha^{-2} = 2$; thus we see that 2 is a square in $\mathbf{F}_p$ if and only
if $\beta \in \mathbf{F}_p$. We know that $\beta \in \mathbf{F}_p$ is equivalent to $\beta^p = \beta$, so we want to
compute $\beta^p = \alpha^p + \alpha^{-p}$. By using the fact that $\alpha^8 = 1$ and $\alpha^4 = -1$, we
see that if $p \equiv \pm 1 \bmod 8$, then $\beta^p = \beta$ so $\beta \in \mathbf{F}_p$, whereas if $p \equiv \pm 3 \bmod 8$,
we have $\beta^p = -\beta$ and therefore $\beta \notin \mathbf{F}_p$. We will postpone the proof of the
quadratic reciprocity law iv) until the next section.
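
Property ii) gives a practical way to compute the Legendre symbol. The following sketch (function names are hypothetical, and $p$ is assumed to be an odd prime) computes it with a modular exponentiation and compares the result with a direct search for square roots.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, computed as a^((p-1)/2) mod p."""
    r = pow(a % p, (p - 1) // 2, p)
    # The residue p - 1 represents -1; the other possible values are 0 and 1.
    return -1 if r == p - 1 else r


def is_nonzero_square(a: int, p: int) -> bool:
    """Brute-force test: is a congruent to x^2 for some x not divisible by p?"""
    return any((x * x - a) % p == 0 for x in range(1, p))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17):
        # Compare with the closed formulas of part iii).
        print(p, legendre(-1, p) == (-1) ** ((p - 1) // 2),
              legendre(2, p) == (-1) ** ((p * p - 1) // 8))
```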


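3.4. Remark. To see where the choice of “$\beta^2 = 2$” comes from, notice
that if $\zeta := \exp(2\pi i/8) \in \mathbf{C}$, then $\zeta$ is an 8th root of unity and
$\zeta = \frac{1 + i}{\sqrt{2}}$, so $\zeta + \zeta^{-1} = \zeta + \bar{\zeta} = \sqrt{2}$.

The Jacobi symbol is a generalization for odd $N = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$ and is given
by
$$\left(\frac{a}{N}\right) := \prod_{i=1}^{s} \left(\frac{a}{p_i}\right)^{\alpha_i}.$$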


Its main properties are stated in the following lemma. 


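3.5. Lemma. For $N, M$ odd:
i) $\left(\frac{ab}{N}\right) = \left(\frac{a}{N}\right)\left(\frac{b}{N}\right)$ and $\left(\frac{a}{N}\right) = 0$ if and only if $\gcd(a, N) > 1$;
ii) $\left(\frac{-1}{N}\right) = (-1)^{(N-1)/2}$ and $\left(\frac{2}{N}\right) = (-1)^{(N^2-1)/8}$;
iii) $\left(\frac{M}{N}\right) = (-1)^{(N-1)(M-1)/4}\left(\frac{N}{M}\right)$.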


Proof. These formulas can be deduced from the analogous formulas for
prime numbers $M$ and $N$. Statement i) is clearly true. To prove ii) and
iii), we write $N = p_1 \cdots p_r$ (with possible repetitions), so that
$$\left(\frac{-1}{N}\right) = \prod_{i=1}^{r} \left(\frac{-1}{p_i}\right) = (-1)^h$$
with $h$ being equal to the number of indices $i$ where $p_i \equiv 3 \bmod 4$. Furthermore,
$N \equiv 3^h \bmod 4$, hence $N \equiv 3 \bmod 4$ if $h$ is odd and $N \equiv 1 \bmod 4$ if $h$
is even; thus we have $\frac{N-1}{2} \equiv h \bmod 2$. Likewise,
$$\left(\frac{2}{N}\right) = \prod_{i=1}^{r} \left(\frac{2}{p_i}\right) = (-1)^h$$
where $h$ is now the number of indices $i$ with $p_i \equiv \pm 3 \bmod 8$. In this case,
we have $N \equiv \pm 3^h \bmod 8$. Therefore, $\frac{N^2-1}{8} \equiv \frac{3^{2h}-1}{8} \equiv h \bmod 2$, which
proves the second formula of ii).

In order to prove assertion iii), we write $M = q_1 \cdots q_s$ and $N = p_1 \cdots p_r$
(with possible repetitions). If $h$ (resp. $k$) is the number of indices $i$ such
that $p_i \equiv 3 \bmod 4$ (resp. $q_i \equiv 3 \bmod 4$), then $\frac{N-1}{2}$ is odd if $h$ is odd
and is even if $h$ is even (resp. $\frac{M-1}{2}$ is odd if $k$ is odd and
is even if $k$ is even). In other words, $\frac{N-1}{2} \equiv h \bmod 2$ and
$\frac{M-1}{2} \equiv k \bmod 2$. We can deduce from this that
$$\left(\frac{M}{N}\right) = \prod_{i,j} \left(\frac{q_j}{p_i}\right) = (-1)^{hk} \prod_{i,j} \left(\frac{p_i}{q_j}\right) = (-1)^{hk}\left(\frac{N}{M}\right) = (-1)^{(N-1)(M-1)/4}\left(\frac{N}{M}\right).$$

Statement iii) is Jacobi's reciprocity law. The two properties provide an
algorithm for calculating the Jacobi symbol. Pay attention however to the
fact that the Jacobi symbol does not characterize squares modulo $N$ (if $a$
is relatively prime to $N$ and a square modulo $N$, then $\left(\frac{a}{N}\right) = 1$, but the
converse is not true when $N$ is not prime).
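
The algorithm alluded to here can be sketched as follows. This is the usual reduction loop based on properties ii) and iii) of Lemma 1-3.5 (remove factors of 2, then swap using reciprocity); the function name is an assumption of this sketch, not notation from the text.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n >= 1, without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = (-1)^((n^2 - 1)/8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: swapping costs a sign exactly when both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    # If the loop ends with n > 1, the original arguments had a common factor.
    return result if n == 1 else 0


if __name__ == "__main__":
    # (2/15) = (2/3)(2/5) = (-1)(-1) = 1, although 2 is not a square modulo 15.
    print(jacobi(2, 15), any(x * x % 15 == 2 for x in range(15)))
```

The last line illustrates the warning above: the symbol equals 1 while 2 has no square root modulo 15.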


As a first application of the quadratic reciprocity law, we will prove that if 
d is a square-free integer, the prime numbers which can be written in the 
form p = x? + dy? satisfy certain congruences modulo 4d. 


To be more precise, if $d = \varepsilon p_1 \cdots p_r$ (where $\varepsilon = \pm 1$) and $p = x^2 + dy^2$, then
$p$ does not divide $y$ because if so $p$ would also divide $x$, and we could then
conclude that $p^2$ divides $p$. Therefore, we know that $-d \equiv (xy^{-1})^2 \bmod p$,
and if $d$ is odd, then
$$1 = \left(\frac{-d}{p}\right) = \left(\frac{-\varepsilon}{p}\right) \prod_{i=1}^{r} \left(\frac{p_i}{p}\right) = \left(\frac{-\varepsilon}{p}\right) \prod_{i=1}^{r} (-1)^{(p_i-1)(p-1)/4} \left(\frac{p}{p_i}\right).$$
We therefore obtain congruences for $p$ modulo $4p_1 \cdots p_r$. If $d$ is even, we
set $p_1 = 2$ and separately calculate $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$, thus obtaining
congruences for $p$ modulo $8p_2 \cdots p_r$.


3.6. Example. If a prime number $p \geq 5$ can be written as $p = x^2 - 6y^2$, with
$x, y \in \mathbf{Z}$, then $\left(\frac{6}{p}\right) = 1$ and also $1 = (-1)^{(p^2-1)/8}(-1)^{(p-1)/2}\left(\frac{p}{3}\right)$, which
is equivalent to $p \equiv 1$ or $3 \bmod 8$ and $p \equiv 1 \bmod 3$, or also to $p \equiv -1$
or $-3 \bmod 8$ and $p \equiv -1 \bmod 3$. By the Chinese remainder theorem, we
can then conclude that $p \equiv 1, 5, 19$ or $23 \bmod 24$. Thus there is no prime
number $p \equiv 7, 11, 13$ or $17 \bmod 24$ which can be written $p = x^2 - 6y^2$.
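
The congruence condition of this example can be tested on a finite range. The sketch below (hypothetical helper names, arbitrary search bound) lists the residues modulo 24 of the primes $p > 3$ of the form $x^2 - 6y^2$ with small $x, y$. The primes 2 and 3 are skipped because the Legendre symbol argument needs $p$ not to divide 6 (for instance $3 = 3^2 - 6 \cdot 1^2$).

```python
def is_prime(n: int) -> bool:
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def residues_of_represented_primes(bound: int) -> set[int]:
    """Residues mod 24 of the primes p > 3 with p = x^2 - 6 y^2, 1 <= x, y < bound."""
    residues: set[int] = set()
    for x in range(1, bound):
        for y in range(1, bound):
            p = x * x - 6 * y * y
            if p > 3 and is_prime(p):
                residues.add(p % 24)
    return residues


if __name__ == "__main__":
    # The example predicts a subset of {1, 5, 19, 23}.
    print(sorted(residues_of_represented_primes(40)))
```

Only a necessary condition is being checked: the search may well find fewer residue classes than the four allowed ones.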


4. Gauss Sums 


Gauss sums are important in arithmetic; we are going to use them to give 
a proof (due to Gauss of course) of the quadratic reciprocity law. In the 
following section, we will use them to calculate the number of solutions 
modulo p of a quadratic equation. 


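Observe that $\exp\left(\frac{2\pi i a}{p}\right)$ only depends on $a \bmod p$, hence this expression
is well-defined for $a \in \mathbf{F}_p$. We will use the following formulas, and leave
the proof of them as an instructive exercise.
$$\sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right) = \sum_{x \in \mathbf{F}_p^*} \left(\frac{x}{p}\right) = 0, \qquad (1.2)$$
$$\sum_{x=0}^{n-1} \exp\left(\frac{2\pi i x y}{n}\right) = \begin{cases} n & \text{if } n \text{ divides } y, \\ 0 & \text{if } n \text{ does not divide } y. \end{cases} \qquad (1.3)$$

The first example of a Gauss sum that we will look at is the following,
where $p$ is an odd prime and $a$ is relatively prime to $p$:
$$\tau(a) := \sum_{x=0}^{p-1} \exp\left(\frac{2\pi i a x^2}{p}\right).$$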

4.1. Proposition. The sums $\tau(a)$ satisfy the following formulas.

i) $\tau(a) = \left(\frac{a}{p}\right)\tau(1)$.

ii) $|\tau(a)|^2 = p$.

iii) $\tau(1)^2 = \left(\frac{-1}{p}\right)p$.

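Proof. If $a$ is a square, then $a\mathbf{F}_p^{*2} = \mathbf{F}_p^{*2}$ hence $\tau(a) = \tau(1)$. Let $a$ be a
quadratic residue and $b$ a quadratic nonresidue modulo $p$. Then
$$\tau(a) + \tau(b) = \sum_{x=0}^{p-1} \exp\left(\frac{2\pi i a x^2}{p}\right) + \sum_{x=0}^{p-1} \exp\left(\frac{2\pi i b x^2}{p}\right)$$
$$= 2 + 2\sum_{u \in a\mathbf{F}_p^{*2}} \exp\left(\frac{2\pi i u}{p}\right) + 2\sum_{u \in b\mathbf{F}_p^{*2}} \exp\left(\frac{2\pi i u}{p}\right) = 2\sum_{u \in \mathbf{F}_p} \exp\left(\frac{2\pi i u}{p}\right) = 0.$$
Hence we have $\tau(b) = -\tau(a) = -\tau(1)$, which proves i). For the second
formula, we can do the calculation in two ways: $\sum_{a=1}^{p-1} |\tau(a)|^2 = (p - 1)|\tau(1)|^2$,
which is also equal to
$$\sum_{a=1}^{p-1} \sum_{x,y \in \mathbf{F}_p} \exp\left(\frac{2\pi i a(x^2 - y^2)}{p}\right) = \sum_{a=1}^{p-1} \sum_{u,v \in \mathbf{F}_p} \exp\left(\frac{2\pi i a u v}{p}\right) = \sum_{a=1}^{p-1} p = p(p - 1),$$
and so we have formula ii). Finally, we know that $\overline{\tau(1)} = \tau(-1) = \left(\frac{-1}{p}\right)\tau(1)$,
hence $\tau(1)^2 = \left(\frac{-1}{p}\right)|\tau(1)|^2 = \left(\frac{-1}{p}\right)p$.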


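4.2. Remark. Formula iii) allows us to deduce that if $p \equiv 1 \bmod 4$, then
$\tau(1) = \pm\sqrt{p}$, whereas if $p \equiv 3 \bmod 4$, $\tau(1) = \pm i\sqrt{p}$. We can actually show
(the proof is a little tricky, see Exercise 1-6.13) that the sign is always positive.
For example, for $p = 3$ and $p = 5$,
$$\sum_{x=0}^{2} \exp\left(\frac{2\pi i x^2}{3}\right) = 1 + 2\exp\left(\frac{2\pi i}{3}\right) = 1 + 2\left(-\frac{1}{2} + \frac{i\sqrt{3}}{2}\right) = i\sqrt{3},$$
$$\sum_{x=0}^{4} \exp\left(\frac{2\pi i x^2}{5}\right) = 1 + 2\exp\left(\frac{2\pi i}{5}\right) + 2\exp\left(-\frac{2\pi i}{5}\right) = 1 + 4\cos\left(\frac{2\pi}{5}\right) = 1 + 4\left(-\frac{1}{4} + \frac{\sqrt{5}}{4}\right) = \sqrt{5}.$$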
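
These values are easy to reproduce in floating-point arithmetic. The sketch below (the function name is an assumption of this sketch) evaluates the quadratic Gauss sum numerically, so that the formulas of Proposition 1-4.1 can be checked up to rounding error.

```python
import cmath
import math


def quadratic_gauss_sum(a: int, p: int) -> complex:
    """Numerical value of the sum over x = 0..p-1 of exp(2*pi*i*a*x^2/p)."""
    # Reducing a*x^2 modulo p first keeps the argument of exp small and accurate.
    return sum(cmath.exp(2j * math.pi * ((a * x * x) % p) / p) for x in range(p))


if __name__ == "__main__":
    print(quadratic_gauss_sum(1, 3))  # close to i * sqrt(3)
    print(quadratic_gauss_sum(1, 5))  # close to sqrt(5)
    print(quadratic_gauss_sum(2, 5))  # close to -sqrt(5), since 2 is a nonresidue mod 5
```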


We can express the sums in another way by proving the following lemma. 


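4.3. Lemma. The following equality holds:
$$\tau(a) = \sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right) \exp\left(\frac{2\pi i a x}{p}\right) = \sum_{x \in \mathbf{F}_p^*} \left(\frac{x}{p}\right) \exp\left(\frac{2\pi i a x}{p}\right).$$

Proof. Notice that $1 + \left(\frac{x}{p}\right)$ is equal to the number of solutions in $\mathbf{F}_p$ to
the equation $y^2 = x$. This gives us:
$$\sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right) \exp\left(\frac{2\pi i a x}{p}\right) = \sum_{x \in \mathbf{F}_p} \left(1 + \left(\frac{x}{p}\right)\right) \exp\left(\frac{2\pi i a x}{p}\right) = \sum_{y \in \mathbf{F}_p} \exp\left(\frac{2\pi i a y^2}{p}\right) = \tau(a),$$
as desired.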

This leads into the first generalization. We define a character as a homomorphism
$\chi : \mathbf{F}_p^* \to \mathbf{C}^*$. We generally refer to the constant function, equal
to 1, as a unitary character, a principal character or even a trivial character;
it is denoted by $\chi_0$. By convention, we extend each character to all of $\mathbf{F}_p$
by $\chi(0) := 0$ if $\chi \neq \chi_0$ and $\chi_0(0) = 1$.
Therefore, for $a$ relatively prime to $p$, we let
$$G(\chi, a) := \sum_{x \in \mathbf{F}_p} \chi(x) \exp\left(\frac{2\pi i a x}{p}\right) = \sum_{x \in \mathbf{F}_p^*} \chi(x) \exp\left(\frac{2\pi i a x}{p}\right)$$
(the two sums agree when $\chi \neq \chi_0$)


and prove the following. 


4.4. Proposition. The sums $G(\chi, a)$ satisfy the following formulas.

i) $G(\chi, a) = \bar{\chi}(a)G(\chi, 1)$.

ii) $|G(\chi, a)|^2 = p$ (if $\chi$ is not a trivial character).

iii) $\overline{G(\chi, 1)} = \chi(-1)G(\bar{\chi}, 1)$.

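Proof. For the first formula, notice that $\chi(a)$ is a root of unity (since
$\chi(a)^{p-1} = \chi(a^{p-1}) = 1$), and therefore $\chi(a^{-1}) = \chi(a)^{-1} = \bar{\chi}(a)$. This
yields
$$G(\chi, a) = \sum_{x \in \mathbf{F}_p^*} \chi(x) \exp\left(\frac{2\pi i a x}{p}\right) = \chi(a^{-1}) \sum_{x \in \mathbf{F}_p^*} \chi(ax) \exp\left(\frac{2\pi i a x}{p}\right) = \chi(a^{-1})G(\chi, 1).$$

For the second formula, $\sum_{a=1}^{p-1} |G(\chi, a)|^2 = (p - 1)|G(\chi, 1)|^2$ and is also
equal to
$$\sum_{a=1}^{p-1} \sum_{x,y \in \mathbf{F}_p} \chi(x)\bar{\chi}(y) \exp\left(\frac{2\pi i a(x - y)}{p}\right)$$
$$= \sum_{a=0}^{p-1} \sum_{x,y \in \mathbf{F}_p} \chi(x)\bar{\chi}(y) \exp\left(\frac{2\pi i a(x - y)}{p}\right) - \sum_{x,y \in \mathbf{F}_p} \chi(x)\bar{\chi}(y)$$
$$= p \sum_{x \in \mathbf{F}_p} \chi(x)\bar{\chi}(x) = p(p - 1).$$

The last formula can be deduced from the equation:
$$\overline{G(\chi, 1)} = G(\bar{\chi}, -1) = \chi(-1)G(\bar{\chi}, 1).$$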


To prove the quadratic reciprocity law, we will introduce the analogue
of these sums in finite characteristic. More precisely, if $p$ and $q$ are two
distinct odd prime numbers, we choose a primitive $p$th root of unity, $\alpha$, in
an extension of $\mathbf{F}_q$; namely $\alpha$ is a root of the equation
$$\alpha^{p-1} + \alpha^{p-2} + \cdots + \alpha + 1 = 0.$$
We then define the “Gauss sum” in $\mathbf{F}_q(\alpha)$ by
$$\tau := \sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right) \alpha^x$$
and prove the following lemma.

4.5. Lemma. Let $\tau$ be the element of $\mathbf{F}_q(\alpha)$ as above. Then
1) $\tau^2 = \left(\frac{-1}{p}\right)p$;
2) $\tau^{q-1} = \left(\frac{q}{p}\right)$.


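Proof. We calculate
$$\tau^2 = \sum_{x,y \in \mathbf{F}_p} \left(\frac{xy}{p}\right) \alpha^{x+y} = \sum_{u \in \mathbf{F}_p} \alpha^u S(u),$$
where $S(u) = \sum_{x+y=u} \left(\frac{xy}{p}\right) = \sum_{x \in \mathbf{F}_p} \left(\frac{x(u-x)}{p}\right)$. For $u = 0$, we
have $S(0) = \sum_{x \in \mathbf{F}_p} \left(\frac{-x^2}{p}\right) = \left(\frac{-1}{p}\right)(p - 1)$. For $u \in \mathbf{F}_p^*$, the sum $S(u)$
equals
$$\sum_{x \in \mathbf{F}_p} \left(\frac{x(u-x)}{p}\right) = \sum_{x \in \mathbf{F}_p^*} \left(\frac{-x^2(1 - ux^{-1})}{p}\right) = \left(\frac{-1}{p}\right) \sum_{x \in \mathbf{F}_p^*} \left(\frac{1 - ux^{-1}}{p}\right) = \left(\frac{-1}{p}\right) \sum_{y \in \mathbf{F}_p \setminus \{1\}} \left(\frac{y}{p}\right),$$
in other words, $S(u) = -\left(\frac{-1}{p}\right)$. In fact, $1 - ux^{-1}$ takes all values in $\mathbf{F}_p$
except 1 as $x$ runs over $\mathbf{F}_p^*$, and the sum of the Legendre symbols over $\mathbf{F}_p$ is zero.
Since $\sum_{u \in \mathbf{F}_p^*} \alpha^u = -1$, we obtain
$$\tau^2 = \left(\frac{-1}{p}\right)(p - 1) - \left(\frac{-1}{p}\right) \sum_{u \in \mathbf{F}_p^*} \alpha^u = \left(\frac{-1}{p}\right)p.$$

For the second formula, since the characteristic is $q$ which is odd, it follows
that
$$\tau^q = \sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right)^q \alpha^{qx} = \sum_{x \in \mathbf{F}_p} \left(\frac{x}{p}\right) \alpha^{qx} = \left(\frac{q}{p}\right) \sum_{x \in \mathbf{F}_p} \left(\frac{qx}{p}\right) \alpha^{qx} = \left(\frac{q}{p}\right)\tau.$$
By using the fact that $\tau \neq 0$, assertion 2) follows from assertion 1).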


Proof. (of the quadratic reciprocity law) We saw that if $q$ does not divide
$a \in \mathbf{Z}$, then $a^{(q-1)/2} \equiv \left(\frac{a}{q}\right) \bmod q$. Therefore, by applying this to
$a = p$, we obtain the following equalities in $\mathbf{F}_q(\alpha)$ by successively invoking
formulas 1) and 2) of the preceding lemma:
$$\left(\frac{p}{q}\right) = p^{(q-1)/2} = \left(\left(\frac{-1}{p}\right)\tau^2\right)^{(q-1)/2} = (-1)^{(p-1)(q-1)/4}\,\tau^{q-1} = (-1)^{(p-1)(q-1)/4}\left(\frac{q}{p}\right).$$
This yields the following equality of signs, first in $\mathbf{F}_q$, then in $\mathbf{Z}$:
$$\left(\frac{p}{q}\right) = (-1)^{(p-1)(q-1)/4}\left(\frac{q}{p}\right),$$
which finishes the proof.


Other proofs of the quadratic reciprocity law are proposed in Exercises 
1-6.13 and 2-7.14. 


5. Applications to the Number of Solutions 
of Equations 


We will now explain another application of Gauss sums (and other elementary
theorems) to finding the number of solutions of equations in $\mathbf{F}_q$ or
$\mathbf{Z}/N\mathbf{Z}$.

5.1. Theorem. (Chevalley-Warning) Let $k = \mathbf{F}_q$ be a finite field of
characteristic $p$. If $P \in k[x_1, \dots, x_n]$ and $\deg(P) < n$, then
$$\operatorname{card}\{x \in k^n \mid P(x) = 0\} \equiv 0 \bmod p.$$
In particular, if $P$ is homogeneous of degree $d < n$, then $P$ has a nontrivial
zero (i.e., distinct from 0).


We will start by calculating the sum of values of a monomial. 

5.2. Lemma. Let $x^m := x_1^{m_1} \cdots x_n^{m_n}$ be a monomial. Then $\sum_{x \in k^n} x^m$ is
zero except when every $m_i$ is non-zero and divisible by $(q - 1)$. In particular,
this sum is zero as soon as $m_1 + \cdots + m_n < (q - 1)n$.

Proof. Let us point out that since the polynomial “$X^0$” is the constant
polynomial, it follows naturally that $0^0 = 1$. The calculation
$$\sum_{x \in k^n} x^m = \sum_{(x_1, \dots, x_n) \in k^n} x_1^{m_1} \cdots x_n^{m_n} = \left(\sum_{x_1 \in k} x_1^{m_1}\right) \cdots \left(\sum_{x_n \in k} x_n^{m_n}\right)$$
brings us back to the case of one variable. If $m = 0$, then $\sum_{y \in k} y^0 = q \cdot 1_k = 0$.
If $m$ is not divisible by $q - 1$, take $y_0$ to be a generator of $k^*$, so $y_0^m \neq 1$,
and therefore,
$$\sum_{y \in k} y^m = \sum_{y \in k} (y_0 y)^m = y_0^m \sum_{y \in k} y^m$$
yields $\sum_{y \in k} y^m = 0$.

Proof. (of the Chevalley-Warning theorem) We can deduce from the lemma
that if $Q \in k[x_1, \dots, x_n]$ and $\deg(Q) < (q - 1)n$, then $\sum_{x \in k^n} Q(x) = 0$.
Now let $P$ be the polynomial in the statement of the Chevalley-Warning
theorem. We will apply the previous result to $Q = 1 - P^{q-1}$. Notice that
$\deg(Q) = (q - 1)\deg(P) < (q - 1)n$ and that $Q(x) = 1$ if $P(x) = 0$, while
$Q(x) = 0$ if $P(x) \neq 0$ and $x \in k^n$. It follows that in $k$, we have the
equality
$$0 = \sum_{x \in k^n} Q(x) = \sum_{\substack{x \in k^n \\ P(x) = 0}} 1 = \operatorname{card}\{x \in k^n \mid P(x) = 0\} \cdot 1_k,$$
which completes the proof since $k$ is of characteristic $p$, and hence $m \cdot 1_k = 0$
is equivalent to $m \equiv 0 \bmod p$.
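
The theorem can be tested by exhaustive enumeration over a small prime field. The following sketch is an illustration under stated assumptions: it works over $\mathbf{F}_p$ only (not a general $\mathbf{F}_q$), and the two sample polynomials are arbitrary choices of this sketch, not taken from the text.

```python
from itertools import product


def count_zeros(poly, n: int, p: int) -> int:
    """Number of x in (F_p)^n with poly(x) = 0, by exhaustive enumeration."""
    return sum(1 for x in product(range(p), repeat=n) if poly(*x) % p == 0)


def affine_quadric(x: int, y: int, z: int) -> int:
    """A degree 2 polynomial in 3 variables (degree < number of variables)."""
    return x * x + 2 * y * y + 3 * z * z + x * y + 1


def cubic_form(x: int, y: int, z: int, w: int) -> int:
    """A homogeneous cubic in 4 variables (degree < number of variables)."""
    return x ** 3 + 2 * y ** 3 + 3 * z ** 3 + w ** 3 + x * y * z


if __name__ == "__main__":
    print(count_zeros(affine_quadric, 3, 5) % 5)  # the theorem predicts 0
    print(count_zeros(cubic_form, 4, 7) % 7)      # the theorem predicts 0
```

For the homogeneous example, the count includes the trivial zero, so a count divisible by $p$ forces at least $p - 1$ nontrivial zeros.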


5.3. Definition. If $Q(x) = \sum_{1 \leq i,j \leq n} a_{ij}x_ix_j$ is a quadratic form where
$a_{ij} = a_{ji}$, we say that it is nondegenerate if $D_Q := \det(a_{ij}) \neq 0$.

5.4. Remark. If we do not impose the symmetry condition $a_{ij} = a_{ji}$,
we can (if the characteristic of the field $k$ is not equal to 2) replace $Q$ by
$Q'(x) = \sum_{1 \leq i,j \leq n} b_{ij}x_ix_j$, where $b_{ij} := \frac{1}{2}(a_{ij} + a_{ji})$, in such a way that
for all $x$, we have $Q(x) = Q'(x)$. In general, the study of quadratic forms
in characteristic 2 is more subtle, and we will therefore avoid it.


We start by showing that if the characteristic of the field $k$ is not equal to
2, then we can replace $Q$ by a diagonal form $Q'(y) = a_1y_1^2 + \cdots + a_ny_n^2$. We
write $Q(x) = {}^t x A x$ where $A$ is symmetric; if we introduce the symmetric
bilinear form $B(x, y) = {}^t x A y$, it follows that $Q(x) = B(x, x)$ and
$B(x, y) = \frac{1}{2}\left(Q(x + y) - Q(x) - Q(y)\right)$. Let $F$ be a vector subspace of $k^n$, and let
$F^{\perp} := \{x \in k^n \mid \forall y \in F,\ B(x, y) = 0\}$. Then we have $\dim F + \dim F^{\perp} = n$.
To see why this is true, we take a basis for $F$, $e_1, \dots, e_r$, and let
$\Phi(x) = (B(e_1, x), \dots, B(e_r, x))$. The kernel of the linear map $\Phi : k^n \to k^r$ is $F^{\perp}$,
and its image is all of $k^r$ because if not there would exist $a_1, \dots, a_r$, not all
zero, such that $0 = a_1B(e_1, x) + \cdots + a_rB(e_r, x) = B(a_1e_1 + \cdots + a_re_r, x)$ for every $x$,
which contradicts the hypothesis that $B$ (or $Q$) is nondegenerate.
It therefore follows that $n = \dim \operatorname{Ker} \Phi + \dim \operatorname{Im} \Phi = \dim F^{\perp} + \dim F$.


We now prove by induction on $n$ that there exists an orthogonal basis.
Choose $e_1$ such that $Q(e_1) \neq 0$, so $k^n = \langle e_1 \rangle \oplus \langle e_1 \rangle^{\perp}$, and we can proceed
inductively since $\dim \langle e_1 \rangle^{\perp} = n - 1$ and since the form remains nondegenerate
when we restrict it to $\langle e_1 \rangle^{\perp}$. Now, if $e_1, \dots, e_n$ is an orthogonal basis
such that $Q(e_i) = a_i$, and we denote by $y_1, \dots, y_n$ the coordinates of the
vector $(x_1, \dots, x_n)$ in the basis $e_1, \dots, e_n$, we have that
$$Q(x_1, \dots, x_n) = Q(y_1e_1 + \cdots + y_ne_n) = a_1y_1^2 + \cdots + a_ny_n^2.$$

Let us point out that if we call the quadratic form on the right $Q'$ and
the change of basis matrix $U$, then $D_{Q'} = \det(U)^2 D_Q$. In particular, if we
work over $\mathbf{F}_p$, then we have $\left(\frac{D_{Q'}}{p}\right) = \left(\frac{D_Q}{p}\right)$. This remark is used in
the proof of the following theorem.


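5.5. Theorem. Let $Q$ be a nondegenerate quadratic form in $n$ variables
with coefficients in $\mathbf{F}_p$ ($p \neq 2$). Then
$$\operatorname{card}\{x \in (\mathbf{F}_p)^n \mid Q(x) = 0\} = p^{n-1} + \varepsilon(p - 1)p^{\frac{n}{2} - 1},$$
where
$$\varepsilon = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ \left(\frac{(-1)^{n/2} D_Q}{p}\right) & \text{if } n \text{ is even}. \end{cases}$$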


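Proof. From the remarks before the statement of the theorem, we can
assume that the form $Q$ is diagonal, in other words, $Q(x) = a_1x_1^2 + \cdots + a_nx_n^2$.
Let $N_p$ be the cardinality that we want to compute. We have
$$pN_p = \sum_{a=0}^{p-1} \sum_{x \in \mathbf{F}_p^n} \exp\left(\frac{2\pi i a Q(x)}{p}\right) = p^n + \sum_{a=1}^{p-1} \sum_{x \in \mathbf{F}_p^n} \exp\left(\frac{2\pi i a(a_1x_1^2 + \cdots + a_nx_n^2)}{p}\right)$$
$$= p^n + \sum_{a=1}^{p-1} \prod_{j=1}^{n} \sum_{x_j \in \mathbf{F}_p} \exp\left(\frac{2\pi i a a_j x_j^2}{p}\right) = p^n + \sum_{a=1}^{p-1} \prod_{j=1}^{n} \tau(a a_j)$$
$$= p^n + \tau(1)^n \left(\frac{a_1 \cdots a_n}{p}\right) \sum_{a=1}^{p-1} \left(\frac{a}{p}\right)^n.$$
Now, $a_1 \cdots a_n = D_Q$, and the sum $\sum_{a=1}^{p-1} \left(\frac{a}{p}\right)^n$ is 0 (resp. $p - 1$) if $n$ is
odd (resp. if $n$ is even). It follows from this that $N_p = p^{n-1}$ if $n$ is odd. If
$n$ is even, observe that
$$\tau(1)^n = \left(\tau(1)^2\right)^{n/2} = \left(\frac{(-1)^{n/2}}{p}\right) p^{n/2},$$
and we have the formula for $N_p$.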
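
For diagonal forms the formula of Theorem 1-5.5 is easy to compare with a direct count. The sketch below assumes a diagonal form given by its coefficients (all non-zero modulo an odd prime $p$), so that $D_Q$ is the product of the coefficients; the function names are hypothetical.

```python
from itertools import product


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def count_zeros_formula(coeffs: tuple[int, ...], p: int) -> int:
    """Zeros of a_1 x_1^2 + ... + a_n x_n^2 in (F_p)^n, by the closed formula."""
    n = len(coeffs)
    if n % 2 == 1:
        return p ** (n - 1)
    disc = 1
    for a in coeffs:
        disc = (disc * a) % p  # D_Q for a diagonal form
    eps = legendre((-1) ** (n // 2) * disc, p)
    return p ** (n - 1) + eps * (p - 1) * p ** (n // 2 - 1)


def count_zeros_brute(coeffs: tuple[int, ...], p: int) -> int:
    """Same count by exhaustive enumeration."""
    return sum(
        1
        for x in product(range(p), repeat=len(coeffs))
        if sum(a * xi * xi for a, xi in zip(coeffs, x)) % p == 0
    )


if __name__ == "__main__":
    for coeffs, p in [((1, 1), 5), ((1, 1), 7), ((1, 1, 1), 7), ((1, 2, 3, 4), 5)]:
        print(coeffs, p, count_zeros_formula(coeffs, p), count_zeros_brute(coeffs, p))
```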


5.6. Remark. This statement gives us a much more precise formulation
of the Chevalley-Warning theorem in the case of quadratic forms. This
is obvious if the quadratic form is nondegenerate; we should add that a
degenerate form can be written, after a variable change, as
$Q(x_1, \dots, x_n) = a_1x_1^2 + \cdots + a_rx_r^2$ with $r < n$ and $D' := a_1 \cdots a_r \neq 0$. In this case, it follows
that
$$N_p = p^{n-r}\left(p^{r-1} + \varepsilon(p - 1)p^{\frac{r}{2} - 1}\right) = p^{n-1} + \varepsilon(p - 1)p^{n - \frac{r}{2} - 1},$$
where now $\varepsilon$ is zero if $r$ is odd and is $\left(\frac{(-1)^{r/2} D'}{p}\right)$ if $r$ is even.

We will now consider a quadratic form $Q(x) = \sum_{1 \leq i \leq j \leq n} a_{ij}x_ix_j$ with integer
coefficients. If we want to count the number of solutions modulo $N$
where $N$ is not necessarily prime, we can rely on the two following lemmas
(where the first is a variation of the Chinese remainder theorem).

5.7. Lemma. Let $\psi_Q(N) := \operatorname{card}\{x \bmod N \mid Q(x) \equiv 0 \bmod N\}$. If $M$
and $N$ are relatively prime, then $\psi_Q(MN) = \psi_Q(M)\psi_Q(N)$.

Proof. This is a corollary of the Chinese remainder theorem: $Q(x) \equiv 0 \bmod MN$
if and only if $Q(x) \equiv 0 \bmod N$ and $Q(x) \equiv 0 \bmod M$, and
furthermore, each pair of congruence classes $x \equiv a \bmod M$, $x \equiv b \bmod N$
corresponds to a congruence class mod $MN$.

This lemma reduces our case to counting the solutions modulo $p^m$. This can
be done thanks to the following lemma, which is a special case of “Hensel's
lemma”.


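5.8. Lemma. Let $p$ be an odd prime number which does not divide $D_Q$.
We define the set of “nonsingular” solutions mod $p^m$ by
$$\mathcal{C}_Q(p^m) := \{x \bmod p^m \mid Q(x) \equiv 0 \bmod p^m \text{ and } x \not\equiv 0 \bmod p\}.$$
Then we have the formula
$$\operatorname{card} \mathcal{C}_Q(p^m) = p^{(m-1)(n-1)} \operatorname{card} \mathcal{C}_Q(p) = p^{(m-1)(n-1)}\left(p^{n-1} - 1 + \varepsilon(p - 1)p^{\frac{n}{2} - 1}\right),$$
with $\varepsilon$ as in Theorem 1-5.5.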


Proof. The second equality is an immediate corollary of the first equality
and of the preceding theorem. We have an obvious map from $\mathcal{C}_Q(p^{m+1})$
to $\mathcal{C}_Q(p^m)$, which sends an $n$-tuple of integers modulo $p^{m+1}$ to the same
$n$-tuple of integers modulo $p^m$. It is enough to show that this map is
surjective and that each fiber has order $p^{n-1}$, since we would then have
$\operatorname{card} \mathcal{C}_Q(p^{m+1}) = p^{n-1} \operatorname{card} \mathcal{C}_Q(p^m)$, and the lemma follows easily from
that. So let $x_0$ be an $n$-tuple of integers such that $Q(x_0) \equiv 0 \bmod p^m$, or
such that $Q(x_0) = p^m a_0$. We know that
$$Q(x_0 + p^m z) = Q(x_0) + 2p^m B(x_0, z) + p^{2m} Q(z) \equiv p^m\left(a_0 + 2B(x_0, z)\right) \bmod p^{m+1},$$
which is zero modulo $p^{m+1}$ if and only if
$$a_0 + 2B(x_0, z) \equiv 0 \bmod p.$$
Since $x_0 \not\equiv 0 \bmod p$ and $B$ is a nondegenerate bilinear form, this last equation
is the equation of an (affine) hyperplane in $\mathbf{F}_p^n$; there are therefore
exactly $p^{n-1}$ solutions modulo $p$ for $z$.
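
The lifting count can be observed on a small example. The following sketch (hypothetical function name, diagonal forms only, brute force) counts the nonsingular solutions modulo $p^m$ and can be used to compare successive levels.

```python
from itertools import product


def count_nonsingular(coeffs: tuple[int, ...], p: int, m: int) -> int:
    """Number of x mod p^m with sum(a_i * x_i^2) = 0 mod p^m and x not 0 mod p."""
    modulus = p ** m
    count = 0
    for x in product(range(modulus), repeat=len(coeffs)):
        # "x not congruent to 0 mod p" means some coordinate is not divisible by p.
        if any(xi % p for xi in x):
            if sum(a * xi * xi for a, xi in zip(coeffs, x)) % modulus == 0:
                count += 1
    return count


if __name__ == "__main__":
    coeffs, p = (1, 1, 1), 3  # Q = x^2 + y^2 + z^2, with D_Q = 1 prime to 3
    c1 = count_nonsingular(coeffs, p, 1)
    c2 = count_nonsingular(coeffs, p, 2)
    # The lemma predicts c2 = p^(n-1) * c1 with n = 3.
    print(c1, c2, c2 == p ** 2 * c1)
```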


Generalization. The calculation done on the quadrics can now be generalized
by considering, on the one hand, the solutions over $\mathbf{F}_q$ and, on the
other hand, forms of arbitrary degree (restricting to diagonal forms). We
therefore consider solutions $x = (x_1, \dots, x_n) \in \mathbf{F}_q^n$ to the equation
$$a_1x_1^d + \cdots + a_nx_n^d = 0. \qquad (1.4)$$


It will be useful to provisionally introduce the trace and the norm, but we 
will give a more general definition in Chap. 3 (Definition 3-4.8). 


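5.9. Definition. Let $q = p^m$ and $x \in \mathbf{F}_q$. We define the trace (resp. the
norm) of $\mathbf{F}_q$ over $\mathbf{F}_p$ as
$$\operatorname{Tr}^{\mathbf{F}_q}_{\mathbf{F}_p} x := x + x^p + \cdots + x^{p^{m-1}} \quad\text{and}\quad \operatorname{N}^{\mathbf{F}_q}_{\mathbf{F}_p} x := x^{1 + p + \cdots + p^{m-1}}. \qquad (1.5)$$

One can easily check that these maps send $\mathbf{F}_q$ to $\mathbf{F}_p$ and that the trace
is $\mathbf{F}_p$-linear (resp. the norm, multiplicative). We first use the trace to
construct an additive character: if $q = p^m$ and $a \in \mathbf{F}_q$, we define it by and
denote it as
$$\psi(a) := \exp\left(\frac{2\pi i \operatorname{Tr}^{\mathbf{F}_q}_{\mathbf{F}_p} a}{p}\right). \qquad (1.6)$$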


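Now we can generalize the calculation over $\mathbf{F}_q$.

5.10. Lemma. Let $b \in \mathbf{F}_q$. Then we have the formula
$$\sum_{a \in \mathbf{F}_q} \psi(ab) = \begin{cases} q & \text{if } b = 0, \\ 0 & \text{if } b \neq 0. \end{cases} \qquad (1.7)$$

Proof. The formula is obviously true for $b = 0$. If $b \neq 0$, the map
$a \mapsto \operatorname{Tr}^{\mathbf{F}_q}_{\mathbf{F}_p}(ab)$ from $\mathbf{F}_q$ to $\mathbf{F}_p$ is $\mathbf{F}_p$-linear and surjective. Therefore,
every element of $\mathbf{F}_p$ appears $p^{m-1}$ times in the image of the trace, and hence
$\sum_{a \in \mathbf{F}_q} \psi(ab) = p^{m-1} \sum_{a \in \mathbf{F}_p} \exp(2\pi i a/p) = 0$.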


Convention. The unitary character $\chi_0$ is defined by $\chi_0(a) = 1$ for every
$a \in \mathbf{F}_q$.
If $\chi : \mathbf{F}_q^* \to \mathbf{C}^*$ is a character (i.e., a homomorphism), other than the
unitary character (over $\mathbf{F}_q^*$), we extend it by $\chi(0) = 0$.

We can therefore define the corresponding Gauss sums for $a \in \mathbf{F}_q^*$:
$$G(\chi, \psi, a) := \sum_{x \in \mathbf{F}_q} \chi(x)\psi(ax) \quad\text{and}\quad G(\chi, \psi) := G(\chi, \psi, 1). \qquad (1.8)$$

We then have a proposition analogous to Proposition 1-4.4 (and leave the
proof as an exercise).

5.11. Proposition. We have the following formulas.
i) $G(\chi_0, \psi, a) = 0$,

ii) $G(\chi, \psi, a) = \bar{\chi}(a)G(\chi, \psi)$,

iii) $|G(\chi, \psi)| = \sqrt{q}$ (if $\chi \neq \chi_0$).


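Let us now come to the calculation of
$$N := \operatorname{card}\{(x_1, \dots, x_n) \in (\mathbf{F}_q)^n \mid F(x) = a_1x_1^d + \cdots + a_nx_n^d = 0\}.$$

We first point out that
$$qN = \sum_{a \in \mathbf{F}_q} \sum_{x \in (\mathbf{F}_q)^n} \psi(aF(x)) = q^n + \sum_{a \in \mathbf{F}_q^*} \sum_{x \in (\mathbf{F}_q)^n} \psi(aF(x))$$
$$= q^n + \sum_{a \in \mathbf{F}_q^*} \prod_{j=1}^{n} \sum_{y \in \mathbf{F}_q} \psi(a a_j y^d) = q^n + \sum_{a \in \mathbf{F}_q^*} \prod_{j=1}^{n} T(d, a a_j),$$
where we let $T(d, a) := \sum_{y \in \mathbf{F}_q} \psi(ay^d)$. The key step in the calculation is
the following observation.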


5.12. Lemma. If $d' = \gcd(d, q - 1)$, then $T(d, a) = T(d', a)$. Suppose
$d$ divides $q - 1$. If $G_d$ denotes the set of the $d$ characters $\chi$ which satisfy
$\chi^d = \chi_0$, and we let $G'_d = G_d \setminus \{\chi_0\}$, then we have the equality
$$T(d, a) = \sum_{\chi \in G'_d} \bar{\chi}(a)G(\chi, \psi). \qquad (1.9)$$

Proof. Let us point out that we must first understand the equality $\chi^d = \chi_0$
as saying: $\forall x \in \mathbf{F}_q^*$, $\chi^d(x) = \chi_0(x) = 1$. This is because if $\chi \in G'_d$, then
$\chi(0)^d = 0 \neq 1 = \chi_0(0)$. The first assertion is an immediate consequence of
the fact that $\mathbf{F}_q^*$ is cyclic of order $q - 1$, thus $\mathbf{F}_q^{*d} = \mathbf{F}_q^{*d'}$. We then check
that with the hypothesis that $d$ divides $q - 1$, we have
$$\sum_{\chi \in G_d} \chi(x) = \begin{cases} d & \text{if } x \in \mathbf{F}_q^{*d}, \\ 1 & \text{if } x = 0, \\ 0 & \text{if not}. \end{cases} \qquad (1.10)$$
We can then deduce that
$$T(d, a) = \sum_{y \in \mathbf{F}_q} \psi(ay^d) = \sum_{x \in \mathbf{F}_q} \sum_{\chi \in G_d} \chi(x)\psi(ax) = \sum_{\chi \in G'_d} \bar{\chi}(a)G(\chi, \psi)$$
since $G(\chi_0, \psi, a) = 0$.


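5.13. Theorem. Let $d$ divide $q - 1$, and let $S_d$ be the set of $n$-tuples of
characters $(\chi_1, \dots, \chi_n)$ such that $\chi_i \neq \chi_0$, $\chi_i^d = \chi_0$ and $\chi_1 \cdots \chi_n = \chi_0$.
Then the number of solutions of the equation $a_1x_1^d + \cdots + a_nx_n^d = 0$ is equal
to
$$N = q^{n-1} + \frac{q - 1}{q} \sum_{(\chi_1, \dots, \chi_n) \in S_d} \bar{\chi}_1(a_1) \cdots \bar{\chi}_n(a_n)\, G(\chi_1, \psi) \cdots G(\chi_n, \psi). \qquad (1.11)$$

Proof. Observe that $\sum_{a \in \mathbf{F}_q^*} \chi(a)$ equals $q - 1$ if $\chi = \chi_0$ and equals zero if
$\chi \neq \chi_0$. It follows from the previous calculations that
$$qN = q^n + \sum_{a \in \mathbf{F}_q^*} \prod_{j=1}^{n} T(d, a a_j)$$
$$= q^n + \sum_{a \in \mathbf{F}_q^*} \sum_{\chi_1, \dots, \chi_n \in G'_d} \prod_{j=1}^{n} \bar{\chi}_j(a a_j)\, G(\chi_j, \psi)$$
$$= q^n + (q - 1) \sum_{(\chi_1, \dots, \chi_n) \in S_d} \prod_{j=1}^{n} \bar{\chi}_j(a_j)\, G(\chi_j, \psi).$$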


5.14. Example. We can prove by induction or a direct calculation that
the cardinality of $S_d$ equals
$$s(n, d) = \frac{1}{d}\left((d - 1)^n + (-1)^n(d - 1)\right).$$
Therefore, $N = q^{n-1} + (q - 1)R$ where $R$ is the sum of the $|S_d|$ terms whose
absolute value equals $q^{\frac{n}{2} - 1}$. For $d = 2$ we find that $s(n, d)$ is zero for $n$
odd and is 1 for $n$ even; if $n = 3$, we find that $s(3, d) = (d - 1)(d - 2)$.
For example, for a cubic equation $a_0x_0^3 + a_1x_1^3 + a_2x_2^3 = 0$ over $\mathbf{F}_q$ where
$q \equiv 1 \bmod 3$ and by letting $\chi$ be a character of order 3 (the other one being
$\chi^2 = \bar{\chi}$), we have
$$N = q^2 - (q - 1)(\alpha + \bar{\alpha}),$$
where $\alpha := -\bar{\chi}(a_0a_1a_2)G(\chi, \psi)^3/q$.


It is interesting to see how this number varies when we choose a tower of
finite fields. A key result in this direction is the Davenport-Hasse theorem,
which connects the different Gauss sums.

5.15. Theorem. (Davenport-Hasse) Let $\mathbf{F}_q$ be a finite field and $\mathbf{F}_{q^m}$ a
finite extension. We denote by $\operatorname{Tr} = \operatorname{Tr}^{\mathbf{F}_{q^m}}_{\mathbf{F}_q}$ the trace and by $\operatorname{N} = \operatorname{N}^{\mathbf{F}_{q^m}}_{\mathbf{F}_q}$
the norm. If $\chi$ is a character of $\mathbf{F}_q^*$, we have the relation
$$-G(\chi \circ \operatorname{N}, \psi \circ \operatorname{Tr}) = \left(-G(\chi, \psi)\right)^m. \qquad (1.12)$$

Proof. See [5] (Chap. 11, Sect. 4) or Exercise 1-6.25.


6. Exercises 


6.1. Exercise. Show that in a commutative group, if the order of $x_1$ is
$d_1$, the order of $x_2$ is $d_2$ and $d_1$ and $d_2$ are relatively prime, then the order
of $x_1x_2$ is $d_1d_2$. Show also that in a cyclic group, if the order of $x_1$ is $d_1$
and the order of $x_2$ is $d_2$, then the order of the subgroup generated by $x_1$
and $x_2$ is equal to the LCM of $d_1$ and $d_2$.

6.2. Exercise. Prove that if the class of $x \in \mathbf{Z}$ generates $(\mathbf{Z}/p^2\mathbf{Z})^*$ then
it also generates $(\mathbf{Z}/p^{\alpha}\mathbf{Z})^*$ (for odd $p$).

6.3. Exercise. Prove that if $N$ is even and $m$ is odd, the last formula of
Proposition 1-2.6 is also true. How should you change the formula when
both $N$ and $m$ are even?


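6.4. Exercise. Let $K := \mathbf{F}_{q^m}$ and $k := \mathbf{F}_q$. Prove that the maps
$\operatorname{N} = \operatorname{N}^K_k : K^* \to k^*$ and $\operatorname{Tr} = \operatorname{Tr}^K_k : K \to k$ are surjective.

Prove that $\operatorname{Ker} \operatorname{N} = \{x^{q-1} \mid x \in K^*\}$ and that $\operatorname{Ker} \operatorname{Tr} = \{x^q - x \mid x \in \mathbf{F}_{q^m}\}$.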


6.5. Exercise. If $b$ is the base of a numeral system (i.e., an integer $\geq 2$),
every real number can be written as an expansion in base $b$:
$$a_0, a_1a_2 \dots a_n \dots = a_0 + a_1b^{-1} + \cdots + a_nb^{-n} + \cdots,$$
with $a_0 \in \mathbf{Z}$ and $0 \leq a_i \leq b - 1$.

1) Prove that this expansion is unique, except for the case where $a_{n_0} < b - 1$
and $a_n = b - 1$ for every $n > n_0$, in which case
$a_0, a_1a_2 \dots a_n \dots = a_0, a_1a_2 \dots (a_{n_0} + 1)000 \dots$

2) If $a/c \in \mathbf{Q}$, show that the expansion of $a/c$ in base $b$ is ultimately periodic
(i.e., it is a repeating decimal) and interpret its period.

6.6. Exercise. Use $(n!)^2 + 1$ and $(n!)^2 - 1$ to prove that there exist infinitely
many prime numbers congruent to 1 modulo 4 (resp. to $-1$ modulo 4).

Use $5(n!)^2 - 1$ to prove that there exist infinitely many prime numbers
congruent to $-1$ modulo 5. Use $2(n!)^2 - 1$ to show that there exist infinitely
many prime numbers congruent to $-1$ modulo 8.


6.7. Exercise. An integer $N$ is said to be a Carmichael number if $N$ is
not prime and $a^{N-1} \equiv 1 \bmod N$ for every $a$ relatively prime to $N$.

a) Show that $N$ is a Carmichael number if and only if $N$ is square-free and
for every prime factor $p$ of $N$, $p - 1$ divides $N - 1$.

b) Show that if $6m + 1$, $12m + 1$ and $18m + 1$ are primes, then their product
is a Carmichael number (for example: $N := 7 \cdot 13 \cdot 19$).
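
Before attempting the proof, the criterion of part a) can be explored experimentally. The sketch below (hypothetical function names, trial division, small inputs only) compares the definition with the square-free divisibility criterion; it is a numerical check and not a solution of the exercise.

```python
from math import gcd


def factorize(n: int) -> dict[int, int]:
    """Prime factorization of n >= 2 by trial division, as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_composite(n: int) -> bool:
    """True if n >= 4 is not prime."""
    return n >= 4 and factorize(n) != {n: 1}


def is_carmichael_by_definition(n: int) -> bool:
    """Composite n with a^(n-1) = 1 mod n for every a prime to n."""
    if not is_composite(n):
        return False
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


def satisfies_criterion(n: int) -> bool:
    """Composite, square-free, and p - 1 divides n - 1 for each prime factor p."""
    if not is_composite(n):
        return False
    factors = factorize(n)
    square_free = all(e == 1 for e in factors.values())
    return square_free and all((n - 1) % (p - 1) == 0 for p in factors)


if __name__ == "__main__":
    n = 7 * 13 * 19
    print(n, is_carmichael_by_definition(n), satisfies_criterion(n))
```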


6.8. Exercise. Let $M := 21560 = 2^3 \cdot 5 \cdot 7^2 \cdot 11$, $N := 21576 = 2^3 \cdot 3 \cdot 29 \cdot 31$
and $G_1 := (\mathbf{Z}/M\mathbf{Z})^*$, $G_2 := (\mathbf{Z}/N\mathbf{Z})^*$.

a) Do the groups $G_1$ and $G_2$ have the same order and are they isomorphic?

b) Calculate the exponent of the group $G_1$, in other words the smallest
integer $m \geq 1$ such that if $a$ is relatively prime to $M$, then $a^m \equiv 1 \bmod M$.


c) How many solutions are there to the equation x2 =1 for x € G1? 


d) How many solutions are there to the equation x? = —1 for x € G,; same 
question for x? = 9? 


6.9. Exercise. Let $L := 11396 = 2^2 \cdot 7 \cdot 11 \cdot 37$, $M := 16200 = 2^3 \cdot 3^4 \cdot 5^2$
and $N := 13176 = 2^3 \cdot 3^3 \cdot 61$; and let $G_1 := (\mathbf{Z}/L\mathbf{Z})^*$, $G_2 := (\mathbf{Z}/M\mathbf{Z})^*$ and
$G_3 := (\mathbf{Z}/N\mathbf{Z})^*$.

a) Are the orders of the groups $G_i$ equal and are the groups isomorphic?

b) Calculate the exponent of $G_i$, in other words the smallest integer $m \geq 1$
such that if $a$ is relatively prime to $L$ (resp. $M$, $N$), then $a^m \equiv 1 \bmod L$
(resp. mod $M$, mod $N$).


c) How many solutions does the equation x? = 1 have in Gi, G2, G3? 


d) How many solutions does the equation $x^{L-1} = 1$ have in $G_1$; same
question for $x^{N-1} = 1$ in $G_3$? (Notice that $L - 1 = 11395 = 5 \cdot 43 \cdot 53$ and
$N - 1 = 5^2 \cdot 17 \cdot 31$.)


6.10. Exercise. Calculate the number $N(a, b, p) = N(a, b)$ of solutions
$(x, y) \in (\mathbf{F}_p)^2$ of the equation $ax^2 + by^2 = 1$.

Hint. You could repeat the steps of Theorem 1-5.5 (or apply the theorem
to the conic $ax^2 + by^2 - z^2 = 0$) for the equation $ax^2 + by^2 = 0$ and then
finish from there. A generalization, as well as a different approach, is given
in the following exercise.


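6.11. Exercise. (Jacobi sums, see [5]) Let $p$ be odd and let $\chi_1, \dots, \chi_n :
\mathbf{F}_p^* \to \mathbf{C}^*$ be characters. We define the Jacobi sum by
$$J(\chi_1, \dots, \chi_n) = \sum_{x_1 + \cdots + x_n = 1} \chi_1(x_1) \cdots \chi_n(x_n).$$
We also denote the principal (trivial) character by $\chi_0$.

1) Prove that Jacobi sums can be factored with the help of Gauss sums in
the following manner. If the $\chi_i$ are all nontrivial and $\chi_1 \cdots \chi_n \neq \chi_0$, then
$$J(\chi_1, \dots, \chi_n) = \frac{G(\chi_1, 1) \cdots G(\chi_n, 1)}{G(\chi_1 \cdots \chi_n, 1)}$$
and in particular
$$|J(\chi_1, \dots, \chi_n)| = p^{\frac{n-1}{2}}.$$

2) Let $N_d(u) := \operatorname{card}\{x \in \mathbf{F}_p \mid x^d = u\}$. If $d' = \gcd(d, p - 1)$, prove that
$N_d(u) = N_{d'}(u)$.

3) Suppose that $d$ divides $p - 1$. Recall why the following formula holds:
$$N_d(u) = \sum_{\chi \in G_d} \chi(u)$$
(where $G_d$ is the set of characters such that $\chi^d = \chi_0$).

4) Let $N := \operatorname{card}\{x \in (\mathbf{F}_p)^n \mid a_1x_1^{d_1} + \cdots + a_nx_n^{d_n} = b\}$. Prove that
$$N = \sum_{a \cdot x = b} N_{d_1}(x_1) \cdots N_{d_n}(x_n),$$
where $a \cdot x = a_1x_1 + \cdots + a_nx_n$. Deduce from this that the number $N$ does
not change if we replace $d_i$ by $\gcd(d_i, p - 1)$.

5) Keeping the same notation, if $d_1, \dots, d_n$ divide $p - 1$ and $b \neq 0$, prove
that
$$N = p^{n-1} + \sum_{(\chi_1, \dots, \chi_n) \in S} \chi_1 \cdots \chi_n(b)\, \bar{\chi}_1(a_1) \cdots \bar{\chi}_n(a_n)\, J(\chi_1, \dots, \chi_n),$$
where $S$ denotes the $n$-tuples of characters $(\chi_1, \dots, \chi_n)$ such that $\chi_i \neq \chi_0$,
but $\chi_i^{d_i} = \chi_0$.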


6.12. Exercise. We define a character as a homomorphism $\chi : (\mathbf{Z}/n\mathbf{Z})^* \to
\mathbf{C}^*$ that we extend by convention to all of $\mathbf{Z}/n\mathbf{Z}$ by $\chi(x) := 0$ if $x$ is
non-invertible. We say that $\chi$ is primitive if it does not come from a character
modulo $m$, where $m$ is a nontrivial divisor of $n$, in other words if we cannot
factor $\chi : (\mathbf{Z}/n\mathbf{Z})^* \to (\mathbf{Z}/m\mathbf{Z})^* \to \mathbf{C}^*$. Let
$$G(\chi, a) = \sum_{x \in \mathbf{Z}/n\mathbf{Z}} \chi(x) \exp\left(\frac{2\pi i a x}{n}\right) = \sum_{x \in (\mathbf{Z}/n\mathbf{Z})^*} \chi(x) \exp\left(\frac{2\pi i a x}{n}\right).$$

Prove the following formulas where $a$ is relatively prime to $n$ and $\chi$ is
primitive modulo $n$.

i) $G(\chi, a) = \bar{\chi}(a)G(\chi, 1)$.
ii) $|G(\chi, a)|^2 = n$.
iii) $\overline{G(\chi, 1)} = \chi(-1)G(\bar{\chi}, 1)$.

6.13. Exercise. In this exercise, you are asked study and calculate the 


SUMS 


a) If N=2M with M odd, prove that G(N) = 0 (divide the sum into the 
terms from 0 to M —1 and the terms from M to 2M — 1). 


r—l1 


b) Let p be an odd prime. By decomposing x = y+ p’*z with y modulo 


r—1 


p and z modulo p, prove that G(p") = pG(p"~?); conclude then that 
G(p*") = p" and G(p*"*") = p"G(p). 


c) We introduce the function $\phi(x) := f(x) + f(x+1) + \cdots + f(x+N-1)$, where $f(x) := \exp\left(\frac{2\pi i x^2}{N}\right)$, on the interval $[0,1]$. Let

$$c_m := \int_0^1 \phi(t)\exp(-2\pi i m t)\,dt$$

be the Fourier coefficient of $\phi$. Check that

$$G(N) = \sum_{m \in \mathbf{Z}} c_m.$$


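d) From this, deduce the equality

$$G(N) = (1 + i^{-N}) \int_{-\infty}^{\infty} \exp\left(\frac{2\pi i t^2}{N}\right) dt = (1 + i^{-N})\sqrt{N}\,C,$$

and compute the constant $C$ by choosing $N = 1$. Now conclude from this that

$$G(N) = \begin{cases} \sqrt{N} & \text{if } N \equiv 1 \bmod 4,\\ i\sqrt{N} & \text{if } N \equiv 3 \bmod 4,\\ (1+i)\sqrt{N} & \text{if } N \equiv 0 \bmod 4,\\ 0 & \text{if } N \equiv 2 \bmod 4. \end{cases}$$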
e) We now introduce $G(a,N) := \sum_{x=0}^{N-1} \exp\left(\frac{2\pi i a x^2}{N}\right)$. Prove that $G(a,MN) = G(aM,N)\,G(aN,M)$ if $\gcd(M,N) = \gcd(a,MN) = 1$, then that if $N$ is odd, we have $G(a,N) = \left(\frac{a}{N}\right) G(N)$. Conclude from this that for $M, N$ relatively prime and odd

$$G(MN) = \left(\frac{M}{N}\right)\left(\frac{N}{M}\right) G(M)\,G(N),$$

and deduce the quadratic reciprocity law from this formula and from the previous question.

f) More generally, if $\gcd(2a,N) = 1$, calculate the sum

$$G(a,b,c,N) := \sum_{x=0}^{N-1} \exp\left(\frac{2\pi i (ax^2 + bx + c)}{N}\right).$$
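The case distinction in question d), namely $G(N) = \sqrt{N}$, $i\sqrt{N}$, $(1+i)\sqrt{N}$ or $0$ according to whether $N \equiv 1, 3, 0, 2 \bmod 4$, can be checked numerically before proving it. The following Python sketch is only a floating-point experiment, not a proof; the names `gauss_sum` and `predicted` are ours.

```python
import cmath
import math

def gauss_sum(N: int) -> complex:
    """Numerical value of G(N) = sum_{x=0}^{N-1} exp(2*pi*i*x^2/N)."""
    # x*x is reduced modulo N first so that the angle stays small.
    return sum(cmath.exp(2j * math.pi * ((x * x) % N) / N) for x in range(N))

def predicted(N: int) -> complex:
    """Closed form of question d), indexed by N mod 4."""
    root = math.sqrt(N)
    table = {0: (1 + 1j) * root, 1: root, 2: 0.0, 3: 1j * root}
    return complex(table[N % 4])

# Largest discrepancy between the sum and the closed form for small N.
max_error = max(abs(gauss_sum(N) - predicted(N)) for N in range(1, 60))
```

The discrepancy `max_error` should be of the order of rounding errors only.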

6.14. Exercise. 1) If $p$ is prime and $a \in \mathbf{Z}$, we let $N(a,p) := \mathrm{card}\{(x,y,z) \in \mathbf{F}_p^3 \mid x^2 + y^2 + z^2 \equiv a \bmod p\}$. If $p$ is odd, prove that $N(a,p) = p^2 + \left(\frac{-a}{p}\right)p$. What is $N(a,2)$ equal to?
2) Let $p$ be an odd prime. Assuming that $N(p,7) = 42$, calculate $N(7,p)$.
3) Let $p$ be a prime number such that $p \equiv 3 \bmod 4$. Calculate

$$M := \mathrm{card}\{(x,y,z) \in \mathbf{F}_p^3 \mid x^4 + y^4 + z^4 \equiv 1 \bmod p\}.$$
6.15. Exercise. By a similar method, prove the following generalization of the Chevalley-Warning theorem (Theorem 1-5.1). Let $P_1,\dots,P_r$ be polynomials of degree $d_1,\dots,d_r$ with $d_1 + \cdots + d_r < n$. Prove that

$$\mathrm{card}\{x \in k^n \mid P_1(x) = \cdots = P_r(x) = 0\} \equiv 0 \bmod p.$$

In particular, if the polynomials are homogeneous, then they have a common nontrivial zero.


6.16. Exercise. We consider the quadratic form given by 


Q(z, y, 2,t) =e 2xry 4 By” + 327 + 7H. 
How many solutions does the equation Q(x,y,z,t) = 0 have modulo 5?
Same question modulo 7? 


6.17. Exercise. We denote by $N_m$ the number of solutions $x, y \in \mathbf{F}_{2^m}$ of the equation $y^2 + y = x^3$. Prove that if $m$ is odd, $N_m = 2^m$, and that if $m$ is even, $N_m = 2^m - (-1)^{m/2}\,2^{1+m/2}$.

Hint. The case where $m$ is even is more subtle. One way is to introduce the sums $R(a) = \sum_{y \in \mathbf{F}_{2^m}} \psi(a(y^2 + y))$ and $S(a) = \sum_{x \in \mathbf{F}_{2^m}} \psi(ax^3)$ and conclude that $N_m = 2^m + 2^{-m}\sum_{a \neq 0} R(a)S(a)$. The sums $S(a)$ can be calculated as in the proof of Lemma 1-5.12, with the help of the Davenport-Hasse relation (Theorem 1-5.15), and then show that $R(a) = 0$ except for $R(1) = 2^m$ before finishing the proof.


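6.18. Exercise. (Kloosterman sums) We define the following sum of exponentials:

$$S(a,b,q) := \sum_{x \in (\mathbf{Z}/q\mathbf{Z})^*} \exp\left(\frac{2\pi i (ax + bx^{-1})}{q}\right),$$

where, by convention, $x^{-1}$ is an integer (modulo $q$) such that $x^{-1}x \equiv 1 \bmod q$. (Notice that, with this convention, $\exp\left(\frac{2\pi i a x^{-1}}{q}\right) \neq \exp\left(\frac{2\pi i a}{xq}\right)$.)
We will use the Weil inequality (see Chap. 6, Formula 6.11):

$$|S(a,b,p)| \le 2\sqrt{p},$$

which is valid for any odd prime number $p$ which does not divide $ab$. We denote by $e(z) = \exp(2\pi i z)$ and $e_q(z) = \exp(2\pi i z/q)$ and also

$$\sum_{x \bmod q} := \sum_{x \in \mathbf{Z}/q\mathbf{Z}} \quad\text{and}\quad \sum_{x \bmod^{*} q} := \sum_{x \in (\mathbf{Z}/q\mathbf{Z})^*},$$

so that we can also write $S(u,v,q) = \sum_{x \bmod^{*} q} e_q(ux + vx^{-1})$.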


1) Prove that the absolute value of the sums $S(a,b,q)$ with respect to the "root mean square" is approximately $\sqrt{q}$, or to be more precise, that

$$\sum_{a,b \bmod q} |S(a,b,q)|^2 = \phi(q)\,q^2.$$

Therefore, on average, the size of $|S(a,b,q)|$ is $\sqrt{\phi(q)}$, or approximately $\sqrt{q}$.

The point of this exercise is to fix an upper bound on the individual sums by using the result due to Weil cited above.
2) Prove that these sums can be factored and reduced to the case $q = p^m$: if $q = q_1q_2$ where $\gcd(q_1,q_2) = 1$, $a = q_2a_1 + q_1a_2$ and $b = q_2b_1 + q_1b_2$, then

$$S(a,b,q) = S(a_1,b_1,q_1)\,S(a_2,b_2,q_2).$$


3) Prove that if $a = p^h a_0$ and $b = p^h b_0$, then

$$S(a,b,p^m) = \sum_{x \bmod^{*} p^m} e\left(\frac{a_0x + b_0x^{-1}}{p^{m-h}}\right) = p^h\,S(a_0,b_0,p^{m-h}).$$

(Which implies that we can reduce to the case $p \nmid \gcd(a,b)$.)
4) Show that if $m/2 \le n < m$ and $p$ does not divide $y$, then $(y + p^nz)^{-1} \equiv y^{-1} - p^ny^{-2}z \bmod p^m$. Now suppose that $m = 2n+1$ (or more generally $m \le 3n$) and that $p$ does not divide $y$. Show that $(y + p^nz)^{-1} \equiv y^{-1} - p^ny^{-2}z + p^{2n}y^{-3}z^2 \bmod p^m$.


5) Prove that if $m/2 \le n < m$ and $p$ does not divide $\gcd(a,b)$, then

$$|S(a,b,p^m)| \le A\,p^n,$$


4 ifp=2andm—-n> 
with A := fp Oe VS 
2 if not. 


Hint. Decompose the sum over $x \bmod^{*} p^m$ into $x = y + p^nz$, with $y \bmod^{*} p^n$ and $z \bmod p^{m-n}$, and take $A$ to be an upper bound for the number of solutions to the congruence $a - by^{-2} \equiv 0 \bmod p^{m-n}$.
6) Deduce from the previous calculation that $S(a,b,p^m) = 0$ if $m \ge 2$ and $p$ divides $a$ but not $b$ or vice versa.


7) If $p$ is odd and $m$ even, prove that if $\gcd(p,a,b) = 1$, then

$$|S(a,b,p^m)| \le 2p^{m/2}.$$

Hint. Using question 6), reduce to the case where $a$ and $b$ are invertible modulo $p$, and apply the result from question 5) with $n := m/2$.


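8) Let $p$ be odd. Prove that

$$\left|\sum_{t \bmod p^{n+1}} e\left(\frac{at^2}{p} + \frac{ht}{p^{n+1}}\right)\right| = \begin{cases} p^{n+\frac{1}{2}} & \text{if } p^n \text{ divides } h \text{ and } p \text{ does not divide } a,\\ 0 & \text{if } p^n \text{ does not divide } h. \end{cases}$$

Hint. If $p^n$ divides $h$ (and $a \not\equiv 0 \bmod p$), we would bring in a Gauss sum; if not, we would decompose the sum over $t = r + ps$ with $r \bmod p$ and $s \bmod p^n$.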


9) Let $p \neq 2$ be a prime which does not divide $\gcd(a,b)$ and let $m$ be an odd number. Prove that

$$|S(a,b,p^m)| \le 2p^{m/2}.$$

Hint. If $m = 2n+1$, write $x = y + p^nz$ with $y \bmod^{*} p^n$ and $z \bmod p^{n+1}$ in the sum and use the preceding question.
10) From this, deduce the following theorem.

Theorem. (Weil, Estermann) The following estimates hold, where we denote by $\omega(q)$ the number of distinct primes which divide $q$ and by $d(q)$ the number of divisors of $q$.

i) If $\gcd(2ab, q) = 1$ then

$$|S(a,b,q)| \le 2^{\omega(q)}\sqrt{q}.$$

ii) In the general case, we have

$$|S(a,b,q)| \le d(q)\,\gcd(a,b,q)^{1/2}\,q^{1/2}.$$
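Before attacking the estimates, it can help to experiment with small moduli. The Python sketch below evaluates $S(a,b,q)$ by brute force, tests the Weil inequality $|S(a,b,p)| \le 2\sqrt{p}$ for a small prime, and computes $\sum_{a,b \bmod q}|S(a,b,q)|^2$, which a direct computation shows equals $\phi(q)q^2$. It is a numerical illustration only; the function names are ours, and `pow(x, -1, q)` requires Python 3.8 or later.

```python
import cmath
import math

def kloosterman(a: int, b: int, q: int) -> complex:
    """S(a, b, q): sum over invertible x mod q of exp(2*pi*i*(a*x + b*x^{-1})/q)."""
    total = 0j
    for x in range(1, q + 1):
        if math.gcd(x, q) == 1:
            x_inv = pow(x, -1, q)  # inverse of x modulo q
            total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % q) / q)
    return total

def weil_bound_holds(p: int) -> bool:
    """Check |S(a, b, p)| <= 2*sqrt(p) for all a, b not divisible by the prime p."""
    bound = 2 * math.sqrt(p) + 1e-9  # small tolerance for rounding errors
    return all(abs(kloosterman(a, b, p)) <= bound
               for a in range(1, p) for b in range(1, p))

def mean_square(q: int) -> float:
    """Sum of |S(a, b, q)|^2 over all pairs (a, b) modulo q."""
    return sum(abs(kloosterman(a, b, q)) ** 2
               for a in range(q) for b in range(q))
```

For instance, `mean_square(12)` can be compared with $\phi(12)\cdot 12^2 = 4 \cdot 144$.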


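6.19. Exercise. Let $p$ be a prime number and $F \in \mathbf{Z}[X_1,\dots,X_n]$ be a homogeneous polynomial; we denote by $\nabla F(x) = \left(\frac{\partial F}{\partial X_1}(x),\dots,\frac{\partial F}{\partial X_n}(x)\right)$ and assume moreover that $\nabla F(x) \equiv 0 \bmod p$ only if $x \equiv 0 \bmod p$ (we say that $F$ is "smooth modulo $p$"). We define the following sum of exponentials

$$S(a,q) := \sum_{x \bmod q} \exp\left(\frac{2\pi i\, aF(x)}{q}\right),$$

where the sum is over $x \in (\mathbf{Z}/q\mathbf{Z})^n$ and $\gcd(a,q) = 1$.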
1) Whenever $q = q_1q_2$ with $\gcd(q_1,q_2) = 1$, find $a_1$ and $a_2$ such that

$$S(a,q) = S(a_1,q_1)\,S(a_2,q_2).$$

2) Check that $F(y + p^{m-1}z) \equiv F(y) + p^{m-1}\,\nabla F(y)\cdot z \bmod p^m$.

3) Let $m \ge 2$. By transforming the sum $S(a,p^m)$ over $x = y + p^{m-1}z$ into a sum over $y \bmod p^{m-1}$ and $z \bmod p$, prove that

$$S(a,p^m) = \begin{cases} p^{n(d-1)}\,S(a,p^{m-d}) & \text{if } m \ge d,\\ p^{n(m-1)} & \text{if } m < d. \end{cases}$$


4) By using Deligne's upper bound, which says that whenever $F$ is smooth and of degree $d$ where $d$ is relatively prime to $p$, we have $|S(a,p)| \le B_{n,d}\,p^{n/2}$ (see Chap. 6, (6.10)), prove the following upper bound

$$|S(a,q)| \le C^{\omega(q)}\,q^{n\left(1 - \frac{1}{d}\right)},$$

where $\omega(q)$ is the number of primes which divide $q$ and where $C$ is a constant which only depends on $F$.
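6.20. Exercise. Let $N_p := \mathrm{card}\{(x,y,z,t) \in (\mathbf{F}_p)^4 \mid ax^4 + by^4 + z^2 + t^2 = 0\}$. Assume also that $ab \neq 0$.

1) Prove that if $p \equiv 3 \bmod 4$, we have

$$N_p = \begin{cases} p^3 + p^2 - p & \text{if } ab \in \mathbf{F}_p^{*2},\\ p^3 - p^2 + p & \text{if } ab \in \mathbf{F}_p^{*} \setminus \mathbf{F}_p^{*2}. \end{cases}$$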


oes p?+3p?-3p if—a/be Fs, 
Ppp? +p if-a/be FA \ F*t. 
Hint. — By following the procedure in the proof of Theorem 1-5.13, show 
that 


PNy = pt + (p= 1) 1 (2) 14 + PEG (x(6/a) + x(a/B))} , 


where $\tau$ is the Gauss sum associated to the Legendre character (of order 2) and $\chi$ is one of the characters of order 4 (the other one being $\bar\chi$).


3) Finish by finding N, if p =2 or if ab=0. 


6.21. Exercise. We will now try to find integer solutions $(x,y)$ of the equation $x^2 + 15y^2 = m$, denoted $(E_m)$.

1) Let $p$ be a prime number $\neq 2, 3, 5$. If $p$ divides $m$, prove that either $p$ divides $x$ and $y$ and then $p^2$ divides $m$, or $\left(\frac{-15}{p}\right) = 1$.

2) Let $p$ be a prime number $\neq 2, 3, 5$. Deduce from this that a necessary condition for the equation $(E_p)$ to have a solution is that $p$ must belong to certain congruence classes modulo 15, and specify these classes.

3) Does the equation $x^2 + 15y^2 = 77077$ have an integer valued solution (notice that $77077 = 7^2 \cdot 11^2 \cdot 13$)?


6.22. Exercise. In this exercise, you are asked to calculate, for each prime number $p \neq 2, 17$, the number $N_p := \mathrm{card}\{(x,y) \in (\mathbf{F}_p)^2 \mid 2y^2 = x^4 - 17\}$.

1) Calculate the numbers $L_p := \mathrm{card}\{(x,y,z) \mid 2y^2 = x^2 - 17z^2\}$ and use this to calculate $M_p := \mathrm{card}\{(x,y) \mid 2y^2 = x^2 - 17\}$.

2) Whenever $p \equiv 3 \bmod 4$, prove that $N_p = M_p$ and, as a consequence, that

$$N_p = \begin{cases} p + 1 & \text{if } p \equiv 3 \bmod 8,\\ p - 1 & \text{if } p \equiv 7 \bmod 8. \end{cases}$$
3) As usual, let $e(z) := \exp(2\pi i z)$ and let

$$\tau(a) := \sum_{x \in \mathbf{F}_p} e(ax^2/p) \quad\text{and}\quad \rho(a) := \sum_{x \in \mathbf{F}_p} e(ax^4/p).$$

Prove that

$$N_p = p + p^{-1}\sum_{a=1}^{p-1} e(17a/p)\,\tau(2a)\,\rho(-a).$$


From now on, we assume that $p \equiv 1 \bmod 4$. We introduce $G = \{\chi_0, \chi_1, \chi_2, \chi_3\}$, the set of characters of $\mathbf{F}_p^*$ such that $\chi_0(x) = 1$ and $\chi^4(x) = 1$ for $x \in \mathbf{F}_p^*$. We extend them to $\mathbf{F}_p$ by the convention $\chi_0(0) = 1$ and $\chi_j(0) = 0$ for $j = 1, 2, 3$. Suppose that $\chi_1$ is the Dirichlet character $\chi_1(a) := \left(\frac{a}{p}\right)$. We also introduce the associated Gauss sums:

$$G(\chi,a) := \sum_{x \in \mathbf{F}_p} \chi(x)\,e(ax/p) \quad\text{and}\quad G(\chi) := G(\chi,1).$$

4) Recall why $G(\chi_0,a) = 0$, $G(\chi,a) = \bar\chi(a)\,G(\chi)$ and also that if $\chi \neq \chi_0$, then $|G(\chi)| = \sqrt{p}$.


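5) Prove the formula

$$\rho(a) = \bar\chi_1(a)G(\chi_1) + \bar\chi_2(a)G(\chi_2) + \bar\chi_3(a)G(\chi_3).$$

6) Using this, find a formula for $N_p$ in terms of Gauss sums of the form

$$N_p = p - \epsilon_0 + \frac{\tau(1)}{p}\left(\epsilon_1 G(\chi_2)^2 + \epsilon_2 G(\chi_3)^2\right),$$

where $|\epsilon_i| = 1$.

7) Conclude that $N_p \ge 1$ for every $p \neq 2, 17$.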


6.23. Exercise. In this exercise, we ask you to prove that the equation

$$2y^2 = x^4 - 17$$

has solutions modulo $N$ for every $N$, but does not have any rational solutions over $\mathbf{Q}$.

1) Assume that there exist $x = a/b$ and $y = c/d$, which are a solution to the equation, with $a, c \in \mathbf{Z}$, $b, d \in \mathbf{N}^*$ and $\gcd(a,b) = \gcd(c,d) = 1$. Prove that $b^4$ divides $d^2$ and that $d^2$ divides $2b^4$ and deduce from this that $d = b^2$ and $2c^2 = a^4 - 17b^4$.

2) Let $p \neq 2$ be a prime which divides $c$. Prove that $p$ is a square modulo 17, and deduce from this that $c$ itself is a square modulo 17. Conclude then that 2 would be a fourth power modulo 17, which is a contradiction.
3) Let $p \neq 2, 17$. It was proven in the previous exercise (Exercise 1-6.22) that there exist $u, v \in \mathbf{F}_p$ where $2u^2 = v^4 - 17$. By using Lemma 1-5.8, prove that the equation in question has solutions modulo $p^n$ for every $n$.

4) Prove that the equation also has solutions modulo $2^n$ and modulo $17^n$, by refining the previous argument. (You might want to use that $2 \cdot 5^2 \equiv 2^4 \bmod 17$ and $3^4 - 17 \equiv 0 \bmod 2^6$.)

5) Using the Chinese remainder theorem, conclude that the equation $2y^2 = x^4 - 17$ has solutions modulo $N$ for every $N$.


6.24. Exercise. In this exercise, let $e(x) := \exp(2\pi i x)$ and notice that for $x \in \mathbf{F}_p$, the expression $e(x/p)$ is well-defined. Let $p$ be odd and let $Q_1(x) = a_1x_1^2 + \cdots + a_nx_n^2$ and $Q_2(x) = b_1x_1^2 + \cdots + b_nx_n^2$ be two quadratic forms with coefficients in $\mathbf{F}_p$. Assume that $n$ is odd and that the following condition is fulfilled.

For $1 \le i < j \le n$, we have $a_ib_j - a_jb_i \neq 0$. (*)

We will calculate $N := \mathrm{card}\{x \in \mathbf{F}_p^n \mid Q_1(x) = Q_2(x) = 0\}$.

a) Prove that $\sum_{a,b \in \mathbf{F}_p}\sum_{x \in \mathbf{F}_p^n} e\left(\frac{aQ_1(x) + bQ_2(x)}{p}\right) = p^2N$, and deduce from this the following formula:

$$N = p^{n-2} + p^{-2}\sum_{(a,b) \neq (0,0)}\ \sum_{x \in \mathbf{F}_p^n} e\left(\frac{aQ_1(x) + bQ_2(x)}{p}\right),$$

where the sum is over nonzero pairs $(a,b) \in \mathbf{F}_p^2$.


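b) Let $\tau := \sum_{x \in \mathbf{F}_p} e(x^2/p)$ and let $Q(x) = c_1x_1^2 + \cdots + c_nx_n^2$. Recall the formula which gives $\sum_{x \in \mathbf{F}_p^n} e(Q(x)/p)$ in terms of the $c_i$ and of the Gauss sum $\tau$, whenever $c_1\cdots c_n \neq 0$. Deduce that if $c_1\cdots c_{n-1} \neq 0$ but $c_n = 0$, then

$$\sum_{x \in \mathbf{F}_p^n} e(Q(x)/p) = \left(\frac{c_1\cdots c_{n-1}}{p}\right)\tau^{n-1}\,p,$$

where $\left(\frac{\cdot}{p}\right)$ designates the Legendre symbol. Also, recall what the value of $\tau^2$ is.

c) To lighten the notation, we let $T(a,b) := \sum_{x \in \mathbf{F}_p^n} e\left(\frac{aQ_1(x) + bQ_2(x)}{p}\right)$. Calculate this last sum for $(a,b) = (b_i, -a_i)$. Show that if $(a,b)$ is not proportional to one of the $(b_i, -a_i)$, then

$$\sum_{\lambda \in \mathbf{F}_p^*} T(\lambda a, \lambda b) = 0.$$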
d) Let

$$D_i = \prod_{1 \le j \le n,\ j \neq i} (a_ib_j - a_jb_i) \quad\text{and}\quad \epsilon_i = \left(\frac{D_i}{p}\right).$$

From the preceding arguments, deduce the formula

$$N = p^{n-2} + (p-1)\left(\frac{-1}{p}\right)^{(n-1)/2} p^{(n-3)/2}\,\sum_{i=1}^{n}\epsilon_i.$$


6.25. Exercise. (Proof of the Davenport-Hasse formula, Theorem 1-5.15) Let $\chi$ be a (nontrivial) character of $\mathbf{F}_q^*$ and let $f \in \mathbf{F}_q[X]$ be monic of degree $n$, i.e., $f(X) = X^n - a_1X^{n-1} + \cdots + (-1)^na_n$. We set $\lambda(f) = \psi(a_1)\chi(a_n)$. Show that $\lambda$ is multiplicative, i.e., that $\lambda(fg) = \lambda(f)\lambda(g)$.

Prove that if $N$, $\mathrm{Tr}$ are the norm and trace of $\mathbf{F}_{q^m}$ to $\mathbf{F}_q$, then

$$G(\chi \circ N, \psi \circ \mathrm{Tr}) = \sum_{f,\ \deg(f) \mid m} \deg(f)\,\lambda(f)^{m/\deg(f)},$$

where the sum is over the monic irreducible polynomials $f$ in $\mathbf{F}_q[X]$ whose degree divides $m$.

Prove the identity

$$1 + G(\chi,\psi)\,T = \sum_{f} \lambda(f)\,T^{\deg(f)} = \prod_{g} \left(1 - \lambda(g)\,T^{\deg(g)}\right)^{-1},$$

where the sum is over the monic polynomials and the product over the irreducible monic polynomials in $\mathbf{F}_q[X]$.

By taking the logarithmic derivative, deduce the Davenport-Hasse relation.


6.26. Exercise. Prove that for every $N$, the equation

$$3x^3 + 4y^3 + 5z^3 = 0$$

has primitive solutions modulo $N$ (i.e., such that $\gcd(x,y,z,N) = 1$). Same question for $5x^3 + 22y^3 + 2z^3 = 0$.


Chapter 2 


Applications: Algorithms, 
Primality and Factorization, 
Codes 


“Elle est retrouvée. 
Quoi ? - L’Eternité. 
C’est la mer allée 
Avec le soleil.” 


ARTHUR RIMBAUD 


This chapter describes some industrial applications of number theory, via computer science. We succinctly describe the main algorithms as well as their theoretical complexity or computation time. We use the notation $O(f(n))$ to denote a function $\le Cf(n)$; furthermore, the constants which appear, unimportant at least from a theoretical point of view, will be ignored. In the following sections, we introduce the basics of cryptography and of the "RSA" system, which motivates the study of primality tests and factorization methods. We finish the chapter with an introduction to error-correcting codes, which will lead us into the study of cyclotomic polynomials.


1. Basic Algorithms 


Let $n$ be an integer. Once we have chosen a base $b \ge 2$, we write $n$ in base $b$, in other words, with the digits $a_i \in [0, b-1]$:

$$n = a_0 + a_1b + \cdots + a_rb^r = \overline{a_ra_{r-1}\dots a_0}^{\,b}, \quad\text{where } a_r \neq 0$$

(the two most standard base choices are $b = 10$ for usual decimal notation and $b = 2$ for binary notation, which is especially well-adapted to computer programming). We will consider an operation on the digits to be a single operation (or an operation which needs $O(1)$ computation time). It is natural to refer to the number of digits necessary in order to describe $n$, in other words $r + 1$, as its complexity. Since we can see that $b^r \le a_rb^r \le n < b^{r+1}$, we know that

$$r \le \frac{\log n}{\log b} < r + 1$$

and can therefore describe the complexity as proportional to $\log n$. It is clear that the manipulation of random numbers of size $n$ requires at least $\log n$ elementary operations. We consider, as much from a practical point of view as from a theoretical one, an algorithm to be "good" if it is a polynomial algorithm; that is to say, it uses $O((\log n)^\kappa)$ elementary operations. Conversely, we consider an exponential algorithm, meaning that its execution time or required number of operations is greater than $\exp(\kappa\log n) = n^\kappa$, to be infeasible (for large $n$, of course).


Addition. In order to add two numbers m and n with at most r digits, we 
must perform at most r additions of two digits and (possibly) carry a digit. 
The cost is therefore O (log max(n,m)) = O(r). The number of operations 
used in subtraction is similar. 


Multiplication. In order to calculate $n \times m$, where $n$ and $m$ are two numbers with at most $r$ digits (with the usual elementary school algorithm), we must perform at most $r^2$ elementary multiplications and $r$ additions, and possibly carry a digit, and therefore, the cost is $O((\log\max(n,m))^2) = O(r^2)$.

Remark. The addition algorithm is (up to constants) optimal, but some more sophisticated methods (notably the "fast Fourier transform") let us perform multiplications at a much better cost, for example in $O(r(\log r)^2)$. See Exercises 2-7.3 and 2-7.4.


Division algorithm. Given $a$ and $b \ge 1$, if we compute $(q,r)$ such that $a = qb + r$ and $0 \le r \le b-1$ with (a variation of) the algorithm learned in elementary school, we perform a number of elementary operations similar to that of multiplication, i.e., $O((\log\max(a,b))^2)$. In order to give an example of a turtle algorithm (do not use!), we could perform the following procedure. We start by setting $q_0 = 0$ and $r_0 = a$. Then we have $a = q_0b + r_0$; if $r_0 < b$, we stop, and if not, we compute $q_1 = q_0 + 1$ and $r_1 = r_0 - b$ in such a way that $a = q_1b + r_1$, and we get the result by iteration and by stopping when $r_n < b$ and $a = q_nb + r_n$. If $a > b$, we must perform approximately $a/b$ subtractions, therefore the cost is $O((\log a) \times (a/b))$ (which is exponential).


Euclidean algorithm. Given two integers, $a$ and $b$, the goal is to compute $d := \gcd(a,b)$ and $(u,v) \in \mathbf{Z}^2$ such that $au + bv = d$ (Bézout's lemma). The principle is the following: we divide $a$ by $b$, $a = bq_1 + r_1$; then divide $b$ by $r_1$, $b = r_1q_2 + r_2$, and in subsequent steps divide $r_n$ by $r_{n+1}$, $r_n = r_{n+1}q_{n+2} + r_{n+2}$. Keep in mind that the sequence $r_n$ is strictly decreasing and stops when $r_{n+1} = 0$, and therefore $\gcd(a,b) = r_n$. In fact,

$$\gcd(a,b) = \gcd(b,r_1) = \gcd(r_1,r_2) = \cdots = \gcd(r_n,r_{n+1}) = r_n.$$

In order to compute $(u,v)$, we could proceed as follows: we set $u_0 = 1$, $u_1 = 0$, $v_0 = 0$ and $v_1 = 1$ and then recursively define $u_n = u_{n-2} - q_nu_{n-1}$ and $v_n = v_{n-2} - q_nv_{n-1}$. One can immediately check by induction that $au_n + bv_n = r_n$. We will now estimate the maximal number of times we need to use the division algorithm. We can assume that $r_0 = a > r_1 = b$ and see that $r_n = r_{n+1}q_{n+2} + r_{n+2} \ge r_{n+1} + r_{n+2}$. If $r_0 > r_1 > \cdots > r_n = d$ is the sequence which gives the gcd, set $d_i = r_{n-i}$. We then have $d_{i+2} \ge d_{i+1} + d_i$. Let $\alpha := (1 + \sqrt{5})/2$ be the positive root of $X^2 = X + 1$; it follows that $d_i \ge \alpha^i$. This is true because $d_0 = d \ge 1 = \alpha^0$, $d_1 \ge d_0 + 1 \ge 2 \ge \alpha^1$ and if the inequality is true until $i + 1$, we have $d_{i+2} \ge d_{i+1} + d_i \ge \alpha^{i+1} + \alpha^i = \alpha^{i+2}$. From this we conclude that $a = d_n \ge \alpha^n$, and the number of steps is bounded above by $\log(a)/\log(\alpha) = O(\log a)$. We should point out that this argument implies that the longest computation happens when $a$ and $b$ are consecutive terms in the Fibonacci sequence (see Exercise 2-7.5). The total cost is therefore $O((\log\max\{|a|,|b|\})^3)$.
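The recursion for $(u_n, v_n)$ translates directly into a loop that keeps only the last two terms of each sequence. The following Python sketch assumes nonnegative inputs; the function name `extended_gcd` and the returned step counter are our own additions, meant to let the reader observe the Fibonacci worst case.

```python
def extended_gcd(a: int, b: int):
    """Return (d, u, v, steps) with a*u + b*v == d == gcd(a, b), for a, b >= 0."""
    r_prev, r_cur = a, b   # r_0 = a, r_1 = b
    u_prev, u_cur = 1, 0   # u_0 = 1, u_1 = 0
    v_prev, v_cur = 0, 1   # v_0 = 0, v_1 = 1
    steps = 0              # number of Euclidean divisions performed
    while r_cur != 0:
        q = r_prev // r_cur
        # Invariant: a*u + b*v == r for the previous and the current triple.
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        u_prev, u_cur = u_cur, u_prev - q * u_cur
        v_prev, v_cur = v_cur, v_prev - q * v_cur
        steps += 1
    return r_prev, u_prev, v_prev, steps
```

Calling `extended_gcd(89, 55)` on two consecutive Fibonacci numbers makes every quotient except the last equal to 1, which is the slowest possible behaviour for inputs of that size.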


Computations in $\mathbf{Z}/N\mathbf{Z}$. The goal is to perform addition and multiplication of two integers smaller than $N$, then to take the remainder obtained from dividing by $N$ in the division algorithm. In order to calculate the inverse of $a$ modulo $N$, we proceed as follows: if $a$ is an integer, the Euclidean algorithm tells us that either $\gcd(a,N) > 1$, in which case $a$ is not invertible modulo $N$, or there exist $u, v$ (obtained from the algorithm) such that $au + Nv = 1$ and therefore the inverse of $a$ is the class of $u$ modulo $N$. The cost is therefore the same as that of the Euclidean algorithm.
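As an illustration, the inverse modulo $N$ only needs the $u$-coefficient of the Euclidean algorithm. The sketch below is one possible arrangement; the name `mod_inverse` and the choice of returning `None` for a non-invertible class are ours.

```python
def mod_inverse(a: int, N: int):
    """Return the inverse of a modulo N (N >= 2), or None if gcd(a, N) > 1."""
    r_prev, r_cur = N, a % N
    u_prev, u_cur = 0, 1   # invariant: r == u * a (mod N) for both pairs
    while r_cur != 0:
        q = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        u_prev, u_cur = u_cur, u_prev - q * u_cur
    if r_prev != 1:
        return None        # a is not invertible modulo N
    return u_prev % N
```

For example, `mod_inverse(3, 7)` gives the class of 5, since $3 \cdot 5 = 15 \equiv 1 \bmod 7$.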


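Exponentiation. In order to calculate $a^m$, we could of course calculate $a \times a \times \cdots \times a$, but this will force us to perform $m - 1$ multiplications; we could do a lot better by performing the computation in $O(\log m)$ multiplications. For example, if $m = 2^r$ we would carry out $r$ multiplications. In the general case, we write $m$ in binary notation $m = \epsilon_0 + \epsilon_1 2 + \cdots + \epsilon_r2^r$ and we would calculate

$$a^m = a^{\epsilon_0}\left(a^{\epsilon_1}\left(\cdots\left(a^{\epsilon_{r-1}}\left(a^{\epsilon_r}\right)^2\right)^2\cdots\right)^2\right)^2.$$
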
Or we could do the calculation in the other direction; the algorithm can be defined iteratively. In order to do this, we start with the initial data chosen to be $(u,v,n) := (1,a,m)$ and we iterate as follows: if $n$ is even, we replace $(u,v,n)$ by $(u, v^2, n/2)$ and if $n$ is odd, we replace $(u,v,n)$ by $(uv, v^2, (n-1)/2)$; we stop when $n = 0$, and we therefore have $u = a^m$. Since $n$ is at least halved in each step, the number of steps $r$ satisfies $2^{r-1} \le m$, and hence we must perform $O(\log m)$ multiplications. If we calculate mod $N$, we reduce each result mod $N$, and so in each step we multiply integers $< N$. The total cost to compute $a^m \bmod N$ is therefore $O(\log m\,(\log N)^2)$.
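The iteration on the triple $(u,v,n)$ can be written down almost verbatim. In the Python sketch below (the function name `power_mod` is ours, and we assume $m \ge 0$ and $N \ge 2$), the quantity $u\,v^n$ stays congruent to $a^m$ modulo $N$ throughout the loop.

```python
def power_mod(a: int, m: int, N: int) -> int:
    """Compute a^m mod N with O(log m) multiplications (m >= 0, N >= 2)."""
    u, v, n = 1, a % N, m
    while n > 0:
        if n % 2 == 1:
            u = (u * v) % N   # n odd: (u, v, n) -> (u*v, v^2, (n-1)/2)
        v = (v * v) % N       # in both cases v is squared
        n //= 2               # n even: n/2, n odd: (n-1)/2
    return u
```

Python's built-in `pow(a, m, N)` performs the same computation and can serve as a reference when testing.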


Computations in F, and Fj. We will assume that the finite field $\mathbf{F}_q = \mathbf{F}_{p^f}$ is defined by an irreducible monic polynomial $S(X) = X^f + s_{f-1}X^{f-1} + \cdots + s_0 \in \mathbf{F}_p[X]$ of degree $f$. We therefore identify $\mathbf{F}_q$ with $\mathbf{F}_p[X]/S\,\mathbf{F}_p[X]$, which can be seen as the vector space over $\mathbf{F}_p$ with basis $1, x, x^2, \dots, x^{f-1}$ with addition on the individual coordinates and multiplication defined by $x \cdot x^j = x^{j+1}$ and $x^f = -s_{f-1}x^{f-1} - \cdots - s_0$. An element of $\mathbf{F}_q$ is therefore seen as an $f$-tuple of integers modulo $p$ or as a polynomial of degree $\le f - 1$. To perform an addition, we must perform $f$ additions in $\mathbf{F}_p$, so at a cost of $O(f\log p) = O(\log q)$. To carry out a multiplication, we take the product of two polynomials, or essentially $f^2$ multiplications in $\mathbf{F}_p$, then divide the result by $S(X)$ using the division algorithm, or essentially $O(f)$ divisions and $O(f^2)$ multiplications in $\mathbf{F}_p$. The cost of a multiplication in $\mathbf{F}_q$ is therefore $O(f^2(\log p)^2) + O(f(\log p)^3)$. Let us point out that this cost is still $O((\log q)^3)$, but that if we choose $q = 2^f$ for example, it is $O(f^2) = O((\log q)^2)$.
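To make the representation concrete, here is a minimal Python sketch of multiplication in $\mathbf{F}_p[X]/S\,\mathbf{F}_p[X]$, with elements stored as coefficient lists in the basis $1, x, \dots, x^{f-1}$. The function name `field_mul` and the list encoding are our own conventions; since $S$ is monic, the reduction step only uses the rule $x^f = -s_{f-1}x^{f-1} - \cdots - s_0$ and needs no division in $\mathbf{F}_p$.

```python
def field_mul(A, B, S, p):
    """Multiply A and B in F_p[X]/(S).

    A, B: coefficient lists [c_0, ..., c_{f-1}] (lowest degree first).
    S:    [s_0, ..., s_{f-1}], the non-leading coefficients of the monic S(X).
    """
    f = len(S)
    prod = [0] * (2 * f - 1)
    # Schoolbook product of two polynomials: about f^2 multiplications in F_p.
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            prod[i + j] = (prod[i + j] + a * b) % p
    # Reduce from the top degree down using x^f = -s_{f-1} x^{f-1} - ... - s_0.
    for k in range(2 * f - 2, f - 1, -1):
        c = prod[k]
        if c:
            prod[k] = 0
            for j in range(f):
                prod[k - f + j] = (prod[k - f + j] - c * S[j]) % p
    return prod[:f]
```

With $p = 2$ and $S(X) = X^2 + X + 1$, so that the quotient is the field with 4 elements, `field_mul([0, 1], [0, 1], [1, 1], 2)` computes $x \cdot x = x + 1$.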


2. Cryptography, RSA 


We are only interested here in one aspect of cryptography and in one system 
of “public keys”, known as RSA from the name of its three inventors, Rivest, 
Shamir and Adleman [61], and which is one of the most widely used. 


Cryptography is the art (or science) of secret messages: we want to send 
information so that only one other person, the recipient, can see it. A 
related problem is to be able to identify with certainty the sender of the 
message. We generally think that the only method is to use a “secret code”; 
in fact the originality of “public key” cryptography comes precisely from 
the fact that the code is not secret, but is known (for the most part) by 
everybody! This is not only a mathematical curiosity, it is also the principle 
governing credit cards, internet transactions, etc. 


The general principle is the following. We call $\mathcal{M}$ the set of messages (in practice we take $\mathcal{M} = [0, N-1]$ or $\mathbf{Z}/N\mathbf{Z}$). Two people, $A$ and $B$, who wish to exchange messages in such a way that a third person, $C$, cannot decipher them each choose bijections $f_A, f_B : \mathcal{M} \to \mathcal{M}$. The set $\mathcal{M}$ (say the integer $N$) is known to everybody, as well as $f_A$ and $f_B$; however, and this is the key idea, the inverse function $f_A^{-1}$ (resp. $f_B^{-1}$) is only known by $A$ (resp. by $B$). This does not mean of course that, knowing $f_A$, it is theoretically impossible to compute $f_A^{-1}$, but this calculation would be so long that it would be out of the question to carry out in a reasonable time frame. We will later see how to construct such functions.

When $A$ wants to send $B$ a message $m \in \mathcal{M}$ (say an integer modulo $N$), he or she simply sends $m' = f_B \circ f_A^{-1}(m)$; remember that $A$ knows $f_B$ (which is public) and $f_A^{-1}$ (which only he or she knows). In order to decode this message, $B$ computes $f_A \circ f_B^{-1}(m')$, which will give $m$; remember that $B$ knows $f_A$ (which is public) and $f_B^{-1}$ (which only he or she knows). The system has two advantages: not only can $C$ not decipher the message without computing $f_B^{-1}$ (which we assume to be out of the question), but $B$ can be sure that it is $A$ who sent the message since it must have been encoded using $f_A^{-1}$, which only $A$ knows!


This procedure is a simplified form of the methods known under the name of the Diffie-Hellman protocol (1976); its security relies on the choice of the "one-way" functions $f$, in other words such that $f$ is quick and easy to compute, but $f^{-1}$ is in practice impossible to determine. Many constructions of functions have been suggested, but one of the most robust and most widely used relies on the fact that if $p$ and $q$ are very large prime numbers (say 100 or more digits), then their product $N := pq$ can be calculated very quickly (say 10,000 elementary operations), whereas if you only know $N$, it is an extremely long calculation to factor it, impossible in practice.


We now construct the functions $f_A$ of the RSA system. We choose two very large prime numbers, $p$ and $q$, compute $N := pq$ and also choose a medium-sized integer $d$ which is relatively prime to $\phi(N) = (p-1)(q-1)$. The public key is therefore $(N,d)$; however, $p$ and $q$ are secret and we set, for $a$ any integer smaller than $N$,

$$f(a) := a^d \bmod N.$$

To decode a message, we calculate the inverse $e$ of $d$ modulo $\phi(N)$ and we observe that

$$f^{-1}(b) = b^e \bmod N,$$

since $(a^d)^e = a^{de} \equiv a \bmod N$, because $a^{\phi(N)} \equiv 1 \bmod N$.
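A toy numerical instance may help fix the roles of the various numbers. In the Python sketch below, the primes $p = 61$, $q = 53$, the exponent $d = 17$ and the message $a = 65$ are illustrative choices of ours, far too small for any real use; we follow the convention of the text in which $d$ is public and $e$ is secret. The call `pow(d, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd

p, q = 61, 53                 # secret primes (toy sizes, assumption for the example)
N = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # phi(N), computable only if p and q are known
d = 17                        # public exponent, chosen relatively prime to phi(N)
assert gcd(d, phi) == 1
e = pow(d, -1, phi)           # secret exponent: inverse of d modulo phi(N)

def f(a: int) -> int:
    """Public encoding a -> a^d mod N."""
    return pow(a, d, N)

def f_inverse(b: int) -> int:
    """Secret decoding b -> b^e mod N."""
    return pow(b, e, N)

message = 65                  # must be relatively prime to N
recovered = f_inverse(f(message))
```

Here `recovered` equals `message`, as predicted by the congruence $a^{de} \equiv a \bmod N$.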


2.1. Remarks. 1) There is one little constraint on the “message” a: it 
should be relatively prime to N'. Nonetheless, observe that the proportion 
of integers which are relatively prime to N is ¢(N)/N = (1—1/p)(1—1/q); 
so if p,q are for example > 10°°, the proportion of integers which are not 
relatively prime to N is < 2-107°°. 


llf by mistake, a message a = pa’ was sent, we could certainly still decode it by 
f(a)? = ptegiet = p*4a' = a, but C, or whoever else, would only have to compute 
gcd(a, N) to discover p and crack the code! 
2) Once $p$, $q$ and $d$ have been chosen, the computation of $N$, $\phi(N)$ and $e$ is performed in polynomial time (fast); likewise, the operation $a \mapsto f(a)$ is just as fast as $a \mapsto f^{-1}(a)$ if we know $e$.

3) We can see, at least heuristically, that knowing the number $e$ allows us to factor $N$: if we write $de - 1 = 2^rM$ (with $M$ odd), by computing $\gcd(a^{2^jM} + 1, N)$ for $j = 1, 2, \dots$ and some values of $a$, we have a good chance of quickly factoring $N$.

4) Therefore, if someone knows only the public key $(N,d)$, they should a priori factor $N$ in order to compute $\phi(N)$ then $e$. In fact, the knowledge of $\phi(N)$ is equivalent to that of $p$ and $q$, because $\phi(N) = N - (p+q) + 1$ (the knowledge of the product and the sum of two integers lets you easily determine the integer pair).
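The heuristic of remark 3) can be tried out on a small modulus. The Python sketch below is one possible way to organize the search, under our own assumptions: the function name `factor_from_exponents`, the number of attempts and the fixed random seed are ours. It looks for a value $x = a^{2^jM}$ with $x^2 \equiv 1$ but $x \not\equiv \pm 1 \bmod N$, in which case $\gcd(x + 1, N)$ is a proper factor.

```python
import math
import random

def factor_from_exponents(N, d, e, tries=100, rng=random.Random(0)):
    """Try to find a proper factor of N = p*q knowing d and e with d*e = 1 mod phi(N)."""
    M, r = d * e - 1, 0
    while M % 2 == 0:         # write d*e - 1 = 2^r * M with M odd
        M //= 2
        r += 1
    for _ in range(tries):
        a = rng.randrange(2, N - 1)
        g = math.gcd(a, N)
        if g > 1:
            return g          # lucky case: a already shares a factor with N
        x = pow(a, M, N)
        for _ in range(r):
            y = (x * x) % N
            if y == 1 and x != 1 and x != N - 1:
                # x is a square root of 1 other than +1 or -1 modulo N.
                return math.gcd(x + 1, N)
            x = y
    return None               # no factor found with these attempts
```

With the toy values $N = 3233$, $d = 17$, $e = 2753$, a call to `factor_from_exponents(3233, 17, 2753)` should return one of the two prime factors.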


This system gives rise to many problems, the solutions to which are more 
or less satisfactory. 


i) How do you construct (very) large prime numbers? 
ii) What methods do we have for factoring an integer? 


iii) How should you choose $p$ and $q$ in RSA so that they resist factorization methods?

Since it is clear from question iii) that the prime numbers should not be too "special", question i) is essentially equivalent to the following problem.

• (I) (Primality Test) Give a fast algorithm which determines whether a number $N$ is prime.


If we had access to such an algorithm $\mathcal{A}$, we could in fact decide on the
size of the integer (for example N ~ 10°°), randomly choose an odd integer 
$N_1$ of this size, and test $\mathcal{A}(N_1)$, then $\mathcal{A}(N_1 + 2)$, $\mathcal{A}(N_1 + 4)$, and so on until we find a prime number. By the theorems on the distribution of prime numbers, the number of primes in an interval $[N_1, N_1 + H]$ is approximately $H/\log(N_1)$; so we expect to find a prime number in $O(\log(N_1))$ tries.

We will see that satisfactory answers to problem i) are available, but we only know partial answers to the other questions.


3. Primality Test (1) 


We consider an odd integer $N$ and the problem of determining whether $N$ is prime. We denote by $(M,N)$ the gcd of $M$ and $N$. The letter $p$ is reserved for a number which we already know is prime. The first of all of the primality tests, and in some sense the "grandfather", is the following lemma.

3.1. Lemma. (Fermat) If $N$ is prime and $(a,N) = 1$, then $a^{N-1} \equiv 1 \bmod N$.

Proof. The group $(\mathbf{Z}/N\mathbf{Z})^*$ has order $N - 1$ and the lemma follows from Lagrange's theorem.²


This is a "good" test, in the sense that computing $a^{N-1} \bmod N$ requires $O(\log N)$ multiplications (under the condition of course that you use the binary notation for $N - 1$). However, it is also a "bad" test, because there are numbers, called Carmichael numbers, which satisfy the test without being prime. We even know that there are infinitely many of them [11], the smallest being $561 = 3 \cdot 11 \cdot 17$. We can easily see that a number $N$ is a Carmichael number if and only if $N$ is square-free and $p - 1$ divides $N - 1$ for every $p$ which divides $N$. In general, we could introduce $\lambda(N)$, the exponent of the group $(\mathbf{Z}/N\mathbf{Z})^*$, sometimes called the Carmichael function: it is the smallest positive integer (in the sense of divisibility or the usual order) such that for all $a$ relatively prime to $N$, $a^{\lambda(N)} \equiv 1 \bmod N$. By what we have seen, we know that if $N = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$ is odd, we have

$$\lambda(N) = \mathrm{lcm}\left(p_1^{\alpha_1 - 1}(p_1 - 1), \dots, p_k^{\alpha_k - 1}(p_k - 1)\right). \qquad (2.1)$$

It is always true that $\lambda(N)$ divides $\phi(N)$ and the equality holds if and only if $(\mathbf{Z}/N\mathbf{Z})^*$ is cyclic, i.e., if $N = p^\alpha$ or $2p^\alpha$ or 4.
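The behaviour of $561$ under Fermat's test, and formula (2.1), are easy to observe by machine. In the Python sketch below, the helper name `carmichael_lambda_odd` and the dictionary encoding of the factorization are our own conventions; the factorization of $N$ is supplied by hand.

```python
from math import gcd

def carmichael_lambda_odd(factorization):
    """lambda(N) for odd N given as {p: alpha}, following formula (2.1)."""
    lam = 1
    for p, alpha in factorization.items():
        term = p ** (alpha - 1) * (p - 1)
        lam = lam * term // gcd(lam, term)   # lcm of lam and term
    return lam

N = 561  # = 3 * 11 * 17
# Every a relatively prime to 561 passes Fermat's test although 561 is composite.
fermat_fooled = all(pow(a, N - 1, N) == 1
                    for a in range(1, N) if gcd(a, N) == 1)
lam = carmichael_lambda_odd({3: 1, 11: 1, 17: 1})
# N is a Carmichael number exactly when lambda(N) divides N - 1.
lambda_divides = (N - 1) % lam == 0
```

Here $\lambda(561) = \mathrm{lcm}(2, 10, 16) = 80$, which divides $560$.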


3.2. Lemma. (Euler³) If $N$ is prime and $(a,N) = 1$, then

$$a^{\frac{N-1}{2}} \equiv \left(\frac{a}{N}\right) \bmod N.$$

Proof. This is simply a restatement of assertion ii) from Theorem 1-3.3.


The Solovay-Strassen test is an algorithm which checks the congruences 
given below for a randomly chosen a. This test is always polynomial (for 
any value of a, we can always quickly calculate the Jacobi symbol thanks 
to the quadratic reciprocity law, see Exercise 2-7.7) and is better than 
Fermat’s test. 


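3.3. Lemma. Let $H := \left\{a \in (\mathbf{Z}/N\mathbf{Z})^* \;\middle|\; a^{\frac{N-1}{2}} \equiv \left(\frac{a}{N}\right) \bmod N\right\}$; then $H = (\mathbf{Z}/N\mathbf{Z})^*$ if and only if $N$ is a prime number.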


2To prove Fermat’s little theorem by using Lagrange’s theorem is obviously an 
anachronism. 

3Calling a statement which uses the Legendre or Jacobi symbol “Euler’s criterion” is 
also an anachronism. 
Proof. We have seen that if $N$ is prime, then $H = (\mathbf{Z}/N\mathbf{Z})^*$. If $p^2$ divides $N$, there exists $a$ of order $p(p-1)$, and $p$ does not divide $N - 1$. Therefore, $a^{N-1} \not\equiv 1$. If $N = p_1p_2\cdots p_r$ with $r \ge 2$, choose (by the Chinese remainder theorem) $a \equiv 1$ modulo $p_2, \dots, p_r$ and which is not a square modulo $p_1$; hence $\left(\frac{a}{N}\right) = -1$, but $a^{(N-1)/2} \equiv 1 \bmod p_2\cdots p_r$ and thus $a^{(N-1)/2} \not\equiv -1 \bmod N$.


Applications. 


i) Probabilistic polynomial test. If $N$ is composite then $((\mathbf{Z}/N\mathbf{Z})^* : H) \ge 2$ and hence by randomly choosing $a$, we have at least a one in two chance that $a \notin H$. Hence if $N$ successively passes $k$ tests, we can say that it is prime with a probability greater than $1 - 2^{-k}$.

ii) Deterministic polynomial test (assuming GRH). Analytic theory has provided a proof that if the Dirichlet functions $L(\chi,s)$ do not vanish on $\mathrm{Re}(s) > 1/2$ (generalized Riemann hypothesis, GRH), then for every nontrivial character $\chi : (\mathbf{Z}/N\mathbf{Z})^* \to \mathbf{C}^*$, there exists an $a < 2(\log N)^2$ such that $\chi(a) \neq 0, 1$. We can deduce from this that if $N$ were composite, there would exist $a < 2(\log N)^2$ which would not pass the Solovay-Strassen test. If $N = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$, we introduce $f(a) := a^{\frac{N-1}{2}}\left(\frac{a}{N}\right)$ and

$$\chi_i : (\mathbf{Z}/N\mathbf{Z})^* \xrightarrow{\ f\ } (\mathbf{Z}/N\mathbf{Z})^* \to (\mathbf{Z}/p_i^{\alpha_i}\mathbf{Z})^* \to \mathbf{C}^*.$$

We see that $H$ is the intersection of the kernels of the $\chi_i$. By trying all of the $a \in [2, 2(\log N)^2]$, we therefore get a primality certificate (i.e., a proof of primality), under the condition that the Riemann hypothesis is true.
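The probabilistic version i) can be sketched as follows. The Python code below is an illustration under our own choices (function names, number of rounds $k$, fixed random seed); the Jacobi symbol is computed with the quadratic reciprocity law and its supplementary law for 2, without factoring.

```python
import random

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0, computed by quadratic reciprocity."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:          # extract factors of 2: (2/n) = -1 iff n = 3, 5 mod 8
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                # reciprocity: sign changes iff both are 3 mod 4
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def solovay_strassen(N: int, k: int = 20, rng=random.Random(1)) -> bool:
    """Return False if N is certainly composite, True if N passed k random tests."""
    if N < 3 or N % 2 == 0:
        return N == 2
    for _ in range(k):
        a = rng.randrange(2, N)
        j = jacobi(a, N)
        # a fails if gcd(a, N) > 1 (j == 0) or if Euler's congruence is violated.
        if j == 0 or pow(a, (N - 1) // 2, N) != j % N:
            return False
    return True
```

A `True` answer only means that no witness of compositeness was found among the $k$ values of $a$ tried; by the estimate in i), this happens for a composite $N$ with probability at most $2^{-k}$.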


We can improve the Solovay-Strassen test and algorithm.

3.4. Lemma. (Rabin-Miller) Let $N$ be odd. Set $N - 1 = 2^sM$, with $M$ odd. If $N$ is prime and $(a,N) = 1$, then either $a^M \equiv 1 \bmod N$ or there exists $0 \le r \le s - 1$ such that $a^{2^rM} \equiv -1 \bmod N$.

Proof. The order of $a$ modulo $N$ is $2^tM'$, where $0 \le t \le s$ and $M'$ is an odd integer which divides $M$. If $t = 0$, then $a^{M'} = 1$ hence $a^M = 1$. If $t \ge 1$, then, since $N$ is prime, $a^{2^{t-1}M'} = -1$, and therefore $a^{2^{t-1}M} = -1$.

This test is better than Euler's test, because, for one thing, if the pair $a$, $N$ passes the Rabin-Miller test, then it also must pass Euler's test. Furthermore, if $N$ is composite, the proportion of $a$ which pass the refined test is $\le 1/4$ and often smaller than that. Of course there exists a probabilistic polynomial version of the refined test and a deterministic polynomial version, assuming that the Riemann hypothesis is true.
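The condition of the Rabin-Miller lemma is straightforward to test for a given pair $(a, N)$. The Python sketch below (the function name `passes_rabin_miller` is ours) computes $a^M$ and then squares at most $s - 1$ times, looking for the value $-1$.

```python
def passes_rabin_miller(N: int, a: int) -> bool:
    """True if a^M = 1 or a^(2^r M) = -1 mod N for some 0 <= r <= s-1 (N odd, N >= 3)."""
    M, s = N - 1, 0
    while M % 2 == 0:       # write N - 1 = 2^s * M with M odd
        M //= 2
        s += 1
    x = pow(a, M, N)
    if x == 1 or x == N - 1:
        return True         # a^M = 1, or the case r = 0
    for _ in range(s - 1):  # cases r = 1, ..., s-1
        x = (x * x) % N
        if x == N - 1:
            return True
    return False

# Number of a in [1, 560] passing the test for the Carmichael number 561.
liars_561 = sum(passes_rabin_miller(561, a) for a in range(1, 561))
```

Although every $a$ relatively prime to $561$ passes Fermat's test, the value $a = 2$ already fails this refined test, and `liars_561` stays below one quarter of the possible values of $a$.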


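3.5. Remark. If $N \equiv 3 \bmod 4$, then "Rabin-Miller" is identical to "Solovay-Strassen", and even equivalent to $a^{(N-1)/2} \equiv \pm 1 \bmod N$. We know that $(N-1)/2$ is odd, and we can observe that if $\epsilon = \pm 1$, then $\left(\frac{\epsilon}{N}\right) = \epsilon$, and if $a^{(N-1)/2} \equiv \pm 1 \bmod N$, then

$$\left(\frac{a}{N}\right) = \left(\frac{a\,(a^2)^{(N-3)/4}}{N}\right) = \left(\frac{a^{(N-1)/2}}{N}\right) \equiv a^{(N-1)/2} \bmod N.$$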

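Proof. (“Rabin-Miller” $\Rightarrow$ “Solovay-Strassen”, in the general case) Now, we
know that $a^{(N-1)/2} = a^{2^{s-1}M}$ equals $-1 \bmod N$ if $r = s - 1$ and equals
$1 \bmod N$ in all of the other cases. Therefore, we need to compute $\left(\frac{a}{N}\right)$.

If $a^M \equiv 1 \bmod N$, then $\left(\frac{a}{N}\right) = \left(\frac{a}{N}\right)^M = \left(\frac{a^M}{N}\right) = 1$, hence
$a^{(N-1)/2} \equiv \left(\frac{a}{N}\right) \bmod N$. Now assume that $a^{2^r M} \equiv -1 \bmod N$. Let $p_i$ divide $N$ and
write $p_i - 1 = 2^{s_i} M_i$. Then, since $a^{2^r M} \equiv -1 \bmod p_i$, the order of $a$ modulo
$p_i$ is of the form $2^{r+1} L_i$ (with $L_i$ odd). Therefore, modulo $p_i$, we get

$$\left(\frac{a}{p_i}\right) \equiv a^{(p_i-1)/2} = a^{2^{s_i-1}M_i} \equiv \begin{cases} 1 & \text{if } s_i > r+1, \\ -1 & \text{if } s_i = r+1. \end{cases}$$

Now notice that $r + 1 \le s_i$. Let $h$ be the number of indices $i$ such that
$s_i = r + 1$. Therefore, we have $\left(\frac{a}{N}\right) = (-1)^h$. Modulo $2^{r+2}$, we have

$$N = 1 + 2^s M = \prod_i p_i = \prod_i (1 + 2^{s_i} M_i) \equiv 1 + h\,2^{r+1} \bmod 2^{r+2}.$$

In the case where $r < s - 1$, $h$ must be even, so that $\left(\frac{a}{N}\right) = 1$, and we get
$a^{(N-1)/2} \equiv 1 \bmod N$. In the case where $r = s - 1$, then $h$ is odd and
$\left(\frac{a}{N}\right) = -1 \equiv a^{(N-1)/2} \bmod N$.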


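We can summarize the previous discussion by introducing the following
sets:

$$G_0 := (\mathbf{Z}/N\mathbf{Z})^*,$$
$$G_1 := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^{N-1} \equiv 1 \bmod N\},$$
$$G_2 := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^{(N-1)/2} \equiv \pm 1 \bmod N\},$$
$$G_3 := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^{(N-1)/2} \equiv \left(\tfrac{a}{N}\right) \bmod N\},$$
$$S := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^M \equiv 1 \bmod N \text{ or } \exists\, r \in [0, s-1] \text{ such that } a^{2^r M} \equiv -1 \bmod N\}.$$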


We always have the inclusions $S \subset G_3 \subset G_2 \subset G_1 \subset G_0$, and these
are equalities if and only if $N$ is prime, or also if and only if $G_3 = G_0$.
Furthermore, $G_1$, $G_2$ and $G_3$ are subgroups, but in general $S$ is not, even
though in the case $N \equiv 3 \bmod 4$ we have seen that $G_2 = G_3 = S$. In fact, $S$
is stable under inversion, and if $a, b \in S$ do not satisfy the same congruence
or both $a^M = b^M = 1$, then $ab \in S$. But if $a^{2^r M} = b^{2^r M} = -1$, it could
happen that $ab \notin S$. For example, if $\epsilon^2 = 1$ but $\epsilon \neq \pm 1$ and if $a^{2M} = -1$
(which would force $N \equiv 1 \bmod 4$), then $a \in S$ and $a\epsilon \in S$, because
$(a\epsilon)^{2M} = -1$. However, $(\epsilon a^2)^M = \epsilon a^{2M} = -\epsilon \neq \pm 1$ and $(\epsilon a^2)^{2M} = 1$,
hence $\epsilon a^2 \notin S$. By considering $a \mapsto \left(\frac{a}{N}\right) a^{(N-1)/2}$ from $G_2$ to $\{\pm 1\}$, we
see that $(G_2 : G_3) = 1$ or $2$. We are now going to compute the cardinality
of the set $S$ and, in particular, verify the following statement.


3.6. Proposition. Let $N$ be an odd, composite number. If $N \neq 9$, then

$$\frac{|S|}{|G_0|} \le \frac{1}{4}.$$


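3.7. Definition. Let $A$, $B$ be integers. We define

$$\varphi(A; B) = \mathrm{card}\,\{a \in (\mathbf{Z}/A\mathbf{Z})^* \mid a^B \equiv 1 \bmod A\}.$$

3.8. Lemma. Let $t \ge 0$ and $N = 1 + 2^s M = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ (with $M$ odd).
We set $p_i - 1 = 2^{s_i} M_i$, $s_i' := \min(t, s_i)$ and $t_i := \gcd(M, M_i)$. Then

$$\varphi(N; 2^t M) = 2^{s_1' + \cdots + s_k'}\, t_1 \cdots t_k.$$

Moreover, the cardinality of the set

$$\{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^{2^t M} \equiv -1 \bmod N\}$$

is $0$ if $t \ge \min_i s_i$, and equal to $\varphi(N; 2^t M) = 2^{tk}\, t_1 \cdots t_k$ if $t < \min_i s_i$.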


Proof. We know that $a^{2^t M} \equiv 1 \bmod N$ if and only if $a^{2^t M} \equiv 1 \bmod p_j^{\alpha_j}$ for
$j = 1, \dots, k$. Now, the group $(\mathbf{Z}/p_j^{\alpha_j}\mathbf{Z})^*$ is cyclic of order $(p_j - 1)p_j^{\alpha_j - 1}$, so
the number of solutions is

$$\gcd(2^t M, (p_j - 1)p_j^{\alpha_j - 1}) = \gcd(2^t M, 2^{s_j} M_j) = 2^{s_j'}\, t_j.$$

By the Chinese remainder theorem, the number of solutions modulo $N$
is therefore the product of these numbers, and hence we have proven the
first claim. For the second claim, we see right away that either there does
not exist any solution, or there does exist a solution and therefore the set
of solutions is in bijection with the solutions of the previous congruence.
The congruence $a^{2^t M} \equiv -1 \bmod p_j^{\alpha_j}$ is solvable if and only if $2^{t+1}$ divides
$(p_j - 1)p_j^{\alpha_j - 1}$, in other words if and only if $t + 1 \le s_j$, hence we have the
desired result.


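Proof. (of Proposition 2-3.6) Assume that $s_1 \le s_2 \le \dots \le s_k$. By
decomposing the set $S$ into $S_0 := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^M \equiv 1 \bmod N\}$ and
$S_{j+1} := \{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid a^{2^j M} \equiv -1 \bmod N\}$ for $0 \le j \le s_1 - 1$ and by
applying Lemma 2-3.8 to each one of these sets, we have

$$\mathrm{card}(S) = t_1 \cdots t_k \left(1 + 1 + 2^k + \cdots + 2^{k(s_1 - 1)}\right) = t_1 \cdots t_k \left(1 + \frac{2^{ks_1} - 1}{2^k - 1}\right).$$

The ratio of $a \in G_0$ which pass the Rabin-Miller test is therefore

$$\frac{\mathrm{card}(S)}{\mathrm{card}(G_0)} = \frac{t_1 \cdots t_k}{M_1 \cdots M_k} \cdot \frac{2^{-(s_1 + \cdots + s_k)}}{p_1^{\alpha_1 - 1} \cdots p_k^{\alpha_k - 1}} \left(\frac{2^{ks_1} + 2^k - 2}{2^k - 1}\right). \qquad (2.2)$$

If $k = 1$, the ratio is equal to $\dfrac{t_1}{M_1 p_1^{\alpha_1 - 1}}$ and is therefore $\le \dfrac{1}{5}$
except when $N = 3^2$, in which case we have $|S|/|G_0| = 1/3$. If $k \ge 2$, we
can assume that $\alpha_1 = \cdots = \alpha_k = 1$; if not, the ratio is $\le 1/p_i$, which in
practice we can assume to be arbitrarily small. If one of the $M_i$ is different
from $t_i$, then $t_1 \cdots t_k / M_1 \cdots M_k \le 1/3$. Furthermore, since $s_1 \ge 1$,

$$2^{-(s_1 + \cdots + s_k)}\, \frac{2^{ks_1} + 2^k - 2}{2^k - 1} \le 2^{-ks_1}\, \frac{2^{ks_1} + 2^k - 2}{2^k - 1} = \frac{1 + 2^{-ks_1}(2^k - 2)}{2^k - 1} \le 2^{1-k},$$

so the ratio is $\le 1/8$ if $k \ge 4$ and $\le 1/4$ if $k = 3$.
If $k = 2$ and if one of the $M_i$ is distinct from the corresponding $t_i$, then the
ratio is $\le 1/6$. If $k = 2$ and $M_1 = t_1$ (i.e., $M_1$ divides $M$) and $M_2 = t_2$ (i.e., $M_2$
divides $M$), we see that $M_1 = M_2$, hence $s_1 < s_2$ (if not $p_1 = p_2$). We then
have that the ratio is

$$\frac{2^{s_1 - s_2} + 2^{1 - s_1 - s_2}}{3} \le \frac{1 + 2^{1 - 2s_1}}{6} \le \frac{1}{4}.$$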


3.9. Remark. By looking at the upper bounds above, we can prove that
the two “worst” cases are the following.

i) The number $N$ is equal to $pq$ with $q = 2p - 1$ and $p \equiv 3 \bmod 4$. For
example, $N = 3 \cdot 5$, $N = 7 \cdot 13$, etc. It follows that $p = 1 + 2M_1$ and
$q = 1 + 4M_1$, and $N = (1 + 2M_1)(1 + 4M_1) = 1 + 2M_1(3 + 4M_1)$, hence
$t_1 = t_2 = M_1 = M_2$ and so

$$\frac{\mathrm{card}(S)}{\mathrm{card}(G_0)} = \frac{1}{4}.$$

ii) The number $N$ is equal to $pqr = 1 + 2M$, where $p = 1 + 2M_1$, $q =
1 + 2M_2$, $r = 1 + 2M_3$ and $M_i$ divides $M$. It follows from this that
the ratio is also $1/4$. Take for example: $N = 8911 = 7 \cdot 19 \cdot 67$ (where
$M_1 = 3$, $M_2 = 9$, $M_3 = 33$ and $M = 4455 = 3^4 \cdot 5 \cdot 11$).
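
The ratios quoted in Proposition 2-3.6 and in this remark can be checked by brute force for small $N$. The sketch below simply enumerates $(\mathbf{Z}/N\mathbf{Z})^*$ and counts the elements of $S$; the function name `rabin_miller_ratio` is our own, and the method is only practical for small odd $N$.

```python
from fractions import Fraction
from math import gcd

def rabin_miller_ratio(n: int) -> Fraction:
    """Return card(S)/card(G_0) for an odd n >= 3, by enumeration."""
    s, m = 0, n - 1
    while m % 2 == 0:             # n - 1 = 2^s * m with m odd
        s += 1
        m //= 2
    units = passing = 0
    for a in range(1, n):
        if gcd(a, n) != 1:
            continue
        units += 1
        x = pow(a, m, n)          # x = a^M
        in_s = (x == 1)
        for _ in range(s):        # x runs through a^(2^r M), r = 0..s-1
            if x == n - 1:
                in_s = True
            x = x * x % n
        passing += in_s
    return Fraction(passing, units)

for n in (9, 15, 91, 8911):
    print(n, rabin_miller_ratio(n))
```

The exceptional case $N = 9$ gives $1/3$, while $N = 15$, $91$ and $8911$ all attain the bound $1/4$.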


4. Primality Test (II) 


In this section we present the Agrawal-Kayal-Saxena algorithm [10], which
dates back to July 2002, and was introduced in their article “PRIMES is in
P”. It gives a primality test in polynomial time.


The original idea was to perform tests in $\mathbf{Z}[X]$. For example, we easily see
that if $N$ is prime, then $(X - a)^N \equiv X^N - a \bmod N$, but this test has the
major drawback of requiring the computation of $N$ coefficients. That will just
not do!


4.1. Lemma. Let $N$ be prime and $h(X) \in \mathbf{Z}[X]$ a polynomial of degree $r$.
Then

$$(X - a)^N \equiv X^N - a \bmod (N, h(X)).$$


Recall that in a ring, the notation $a \equiv b \bmod I$ means that $a - b$ belongs to
the ideal $I$ and that $(a_1, \dots, a_m)$ is the notation used for the ideal generated
by $a_1, \dots, a_m$. Thus the congruence in the lemma can be restated as: there
exist $P, Q \in \mathbf{Z}[X]$ such that $(X - a)^N - (X^N - a) = N P(X) + h(X) Q(X)$.
It should be noted that if $r$ is $O((\log N)^k)$, then this test remains polynomial.
The problem is to choose pairs $a$, $h(X)$ in such a way that they
detect non-primality. The solution proposed by Agrawal, Kayal and Saxena
is to choose $h(X) = X^r - 1$ with $r$ being a “very well-chosen” prime,
in particular $r = O((\log N)^k)$, and to prove that it is then sufficient to test
the $a \in [1, L]$ with $L = O(\sqrt{r} \log N)$ in order to ensure that $N$ is prime, or
possibly a prime power, which is not so bad.
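
To make the congruence concrete, here is a minimal sketch of the test $(X - a)^N \equiv X^N - a \bmod (N, X^r - 1)$. Polynomials are stored as lists of $r$ coefficients; the function names are our own, and the quadratic multiplication is only meant to be readable, not fast.

```python
def polymul_mod(f, g, n, r):
    """Product of f and g in Z[X]/(n, X^r - 1); f, g are lists of r coefficients."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                if gj:
                    k = (i + j) % r          # X^r = 1, so exponents wrap around
                    h[k] = (h[k] + fi * gj) % n
    return h

def polypow_mod(f, e, n, r):
    """f^e in Z[X]/(n, X^r - 1) by repeated squaring."""
    result = [1] + [0] * (r - 1)
    while e:
        if e & 1:
            result = polymul_mod(result, f, n, r)
        f = polymul_mod(f, f, n, r)
        e >>= 1
    return result

def satisfies_congruence(a, n, r):
    """Check whether (X - a)^n = X^n - a modulo (n, X^r - 1); assumes r >= 2."""
    lhs = polypow_mod([(-a) % n, 1] + [0] * (r - 2), n, n, r)
    rhs = [0] * r
    rhs[n % r] = 1                           # X^n reduces to X^(n mod r)
    rhs[0] = (rhs[0] - a) % n
    return lhs == rhs
```

With $N = 31$ and $r = 5$ every $a$ passes, as Lemma 2-4.1 predicts, whereas for $N = 6$, $r = 5$, $a = 1$ the two sides differ.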

The argument is essentially algebraic and combinatorial, but nevertheless
uses a result on the distribution of prime numbers, in fact a weak form
of the prime number theorem (see Chap. IV, (4.10)), which says that the
sum of the $\log p$ for $p$ prime and smaller than $x$ is $\ge c_1 x$ for some constant
$c_1 > 0$. We summarize what we are going to use in a lemma.


4.2. Lemma. Let $Y \ge 1$ and let $N \ge 2$ be an integer. There exists a
prime number $r$ which satisfies the following two properties.

i) The order of $N$ modulo $r$ is at least $Y$.

ii) Furthermore, $r = O(Y^2 \log N)$.

Proof. Set $A := \prod_{1 \le y \le Y} (N^y - 1)$. Let $r$ be the smallest prime number
which does not divide $A$. Then for $y \le Y$, we have $N^y \not\equiv 1 \bmod r$, and hence
condition i). Moreover, every $p < r$ divides $A$, whereas $A < N^{Y(Y+1)/2}$
and consequently

$$c_1 r \le \sum_{p < r} \log p \le \log A < \frac{Y(Y+1)}{2} \log N.$$

From this we have that $r = O(Y^2 \log N)$.


Remark. We could add that, since the order of N modulo r divides r — 1, 
we necessarily have r > Y. 
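
The construction in the proof is easy to carry out for small values. The sketch below searches for the smallest prime $r$ which divides none of the $N^y - 1$ for $1 \le y \le Y$; the function names are our own, and we add the assumption that $r$ does not divide $N$ (so that the order of $N$ modulo $r$ makes sense), which the algorithm further on checks separately.

```python
def is_prime(m: int) -> bool:
    """Naive trial division, sufficient for the small r considered here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def smallest_suitable_r(n: int, y_max: int) -> int:
    """Smallest prime r, not dividing n, with n^y != 1 mod r for 1 <= y <= y_max."""
    r = 2
    while True:
        if is_prime(r) and n % r != 0:
            # r divides none of the n^y - 1, so the order of n mod r exceeds y_max
            if all(pow(n, y, r) != 1 for y in range(1, y_max + 1)):
                return r
        r += 1
```

For example, with $N = 31$ and $Y = 4$ we obtain $r = 7$ (the order of $31 \equiv 3$ modulo $7$ is $6$), and indeed $r > Y$.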


We will also use the following elementary combinatorial lemma. 


4.3. Lemma. The cardinality of the set of monomials in $L$ variables of
degree $\le k$ is

$$\mathrm{card}\,\{(m_1, \dots, m_L) \mid m_i \ge 0 \text{ and } m_1 + \cdots + m_L \le k\} = \binom{L+k}{k}.$$

Furthermore, we have the estimate

$$\binom{L+k}{k} \ge 2^{\min(L, k)}.$$


Proof. The first formula is classical and can be proven, for example, by
induction (call the cardinality in question $f(L, k)$, check that $f(L, 0) = 1$
and $f(1, k) = k + 1$, and then prove that $f(L, k) = f(L, k-1) + f(L-1, k)$).
For the lower bound, observe that if $k \le L$, then

$$\binom{L+k}{k} = \frac{(L+k)(L+k-1) \cdots (L+2)(L+1)}{k(k-1) \cdots 2 \cdot 1} = \prod_{j=1}^{k} \left(1 + \frac{L}{j}\right) \ge 2^k,$$

and if $L < k$, reverse the roles of $L$ and $k$.


Remark. We can often improve this inequality; for example, if $1 \le k \le L$,
then $\binom{L+k}{k} \ge 2^k (L+1)/2$, and thus if $L \ge 5$, we have $\binom{L+k}{k} \ge 2^{k+1}$.


We will now state a version of the main theorem of Agrawal-Kayal-Saxena. 


4.4. Theorem. Let $N \ge 2$ and let $r$ be a prime number satisfying:

i) no prime number $\le r$ divides $N$;
ii) we have $\mathrm{ord}(N \bmod r) \ge (2 \log N / \log 2)^2 + 1$;
iii) for $1 \le a \le r - 1$, we have

$$(X - a)^N \equiv X^N - a \bmod (N, X^r - 1).$$

Then $N$ is a prime power.


Remarks. In order to prove this theorem, we only assume that hypothesis
iii) is satisfied for $1 \le a \le L$ and we will see that we can take $L$ smaller
than $r - 1$. By Lemma 2-4.2, we can choose $r = O((\log N)^5)$ such that ii)
is satisfied, and it would necessarily follow that $r > (2 \log N / \log 2)^2 + 1$.
Thus it is clear that the theorem implies that the following algorithm is
correct and polynomial.


ALGORITHM. [10] We put in $N$ and the algorithm returns “Prime” or
“Composite”.

1) We check to see if $N = a^b$ where $b \ge 2$; if so, then $N$ is “Composite”.

2) We try the prime numbers $r = 2, 3, \dots$ If $r$ divides $N$, $N$ is “Composite”.
If not, we check whether $r$ is relatively prime to $N^y - 1$ for
$y = 1, 2, \dots, Y$, where $Y = \lfloor (2 \log N / \log 2)^2 \rfloor + 1$; if so we keep $r$ and
go to the next step, if not we look for a larger $r$.

3) For $a = 1, 2, 3, 4, \dots$ (stop at $r - 1$), we check whether $(X - a)^N \not\equiv
X^N - a \bmod (N, X^r - 1)$. If so, then $N$ is “Composite”, if not, we
proceed to $a + 1$.

4) If the algorithm keeps going until $a = r - 1$, then $N$ is “Prime”.


Let us briefly discuss its complexity (without trying to optimize it). We
easily see that the longest step is step (3), which requires $O(r \log N)$
multiplications in the ring $\mathbf{Z}[X]/(N, X^r - 1)$, where each one uses at most
$O((r \log N)^2)$ elementary operations. We thus have $O((r \log N)^3)$ in all.
If we add that $r = O((\log N)^5)$, we obtain a complexity of at most
$O((\log N)^{18})$.

We now proceed to the proof of the theorem. Let $p$ be a prime divisor of $N$.
We denote by $d_1 := \mathrm{ord}(N \bmod r)$, $d_2 := \mathrm{ord}(p \bmod r)$ and $d := \mathrm{lcm}(d_1, d_2)$.
It should be noted that $d_1$ (resp. $d_2$) is the order of the subgroup generated
by $N$ (resp. by $p$) in $(\mathbf{Z}/r\mathbf{Z})^*$ and that $d$ is therefore the order of the
subgroup generated by $N$ and $p$ in $(\mathbf{Z}/r\mathbf{Z})^*$. We then choose $h(X)$ to be
an irreducible factor of $\Phi_r(X) := (X^r - 1)/(X - 1)$ in $\mathbf{F}_p[X]$. Let us point
out, even if we do not need it, that $\deg(h) = d_2$ (see Theorem 2-6.2.8). We
will work in the field $K := \mathbf{F}_p[X]/(h(X))$, which is a finite field (isomorphic
to $\mathbf{F}_{p^{d_2}}$) and which we obtain by adding a primitive $r$th root of unity to
$\mathbf{F}_p$. By construction, $x := X \bmod h(X)$ is of order $r$ in $K^*$. It is natural
to look at the subgroup $G$ of $K^*$ generated by the classes of $(X - a)$ for
$1 \le a \le L$. The heart of the proof consists of finding an upper and lower
bound for the order of $G$.

4.5. Lemma. We have the lower bound

$$\mathrm{card}(G) \ge \binom{L + d - 1}{d - 1} \ge 2^{\min(L, d-1)}.$$

From the remark immediately following the combinatorial lemma (Lemma
2-4.3), we have for example that if $1 \le d - 1 \le L$, then $\mathrm{card}(G) \ge 2^d$, and
if $L < d$, then $\mathrm{card}(G) \ge 2^{L+1}$.


Proof. In light of the combinatorial lemma mentioned above, it suffices to
show that the classes of elements

$$\prod_{a=1}^{L} (X - a)^{m_a}, \quad \text{for } m_a \ge 0 \text{ and } \sum_{1 \le a \le L} m_a \le d - 1,$$

are all distinct in $K$. First of all, the $a$ are distinct modulo $p$, because if
not, then $p \le L < r$ and we assumed that $N$ was not divisible by any prime
number $\le r$, so $p > r$. Thus our polynomials are all distinct in
$\mathbf{F}_p[X]$. Now we bring in the key point that if $P = \prod_{1 \le a \le L} (X - a)^{m_a}$,
then we have, on one hand, $P(X)^N \equiv P(X^N) \bmod (N, X^r - 1)$, but also
$P(X)^p \equiv P(X^p) \bmod p$, so the two congruences are valid $\bmod\, (p, X^r - 1)$.
For $m = N^i p^j$, it therefore follows that

$$P(X)^m \equiv P(X^m) \bmod (p, X^r - 1) \quad \text{or even} \quad \bmod\, (p, h(X)).$$

In fact, the set of $m$ such that $P(X)^m \equiv P(X^m) \bmod (p, X^r - 1)$ is
multiplicative (the fairly simple proof is given in detail in part ii) of 2-4.7 below).
Now let $P$ and $Q$ be two polynomials of the form given above (considered
in $\mathbf{F}_p[X]$), and suppose that they are in the same class in $K$, i.e., suppose
$P \equiv Q \bmod (p, h(X))$. Let $x$ be the class of $X$, which is a primitive $r$th
root of unity in $K$, and therefore

$$(P - Q)(x^m) = 0, \quad \text{for } m \in \langle N, p \rangle \subset (\mathbf{Z}/r\mathbf{Z})^*.$$

But we know that $N$ and $p$ generate a subgroup of order $d$ in $(\mathbf{Z}/r\mathbf{Z})^*$, thus
the polynomial $P - Q$ has at least $d$ roots, and since $\deg(P - Q) \le d - 1$,
we see that $P = Q$ (first in $\mathbf{F}_p[X]$, then, if we want, in $\mathbf{Z}[X]$).


In order to find an upper bound for $|G|$, we choose a generator of $G$ (it is
a subgroup of $K^*$ and is thus cyclic) and define the following set.


4.6. Definition. Let $g$ be a generator of $G$. We define

$$\mathcal{I} = \mathcal{I}_g := \{m \in \mathbf{N} \mid g(X)^m \equiv g(X^m) \bmod (X^r - 1, p)\}.$$


The main properties of $\mathcal{I}$ are summarized in the following lemma.

4.7. Lemma. The set $\mathcal{I}$ satisfies the following properties.

i) $N$ and $p$ are in $\mathcal{I}$.
ii) $\mathcal{I}$ is multiplicative, i.e., if $m_1$ and $m_2 \in \mathcal{I}$, then $m_1 m_2 \in \mathcal{I}$.
iii) If $m_1$ and $m_2 \in \mathcal{I}$ satisfy $m_1 \equiv m_2 \bmod r$, then $m_1 \equiv m_2 \bmod \mathrm{card}(G)$.


Proof. The first property has already been established. For ii), write

$$g(X)^{m_1 m_2} = (g(X)^{m_1})^{m_2} \equiv (g(X^{m_1}))^{m_2} \bmod (p, X^r - 1),$$

and notice that since $m_2 \in \mathcal{I}$, we have $g(Y)^{m_2} \equiv g(Y^{m_2}) \bmod (p, Y^r - 1)$.
Therefore, by substituting $Y = X^{m_1}$, we obtain

$$(g(X^{m_1}))^{m_2} = g(X^{m_1 m_2}) + p\,Q_1(X^{m_1}) + (X^{m_1 r} - 1)\,Q_2(X^{m_1}) \equiv g(X^{m_1 m_2}) \bmod (p, X^r - 1).$$

In order to prove iii), suppose that $m_1$ and $m_2 \in \mathcal{I}$, and that $m_2 = m_1 + kr$,
where $k \ge 0$. It follows from this that
$g(X)^{m_2} \equiv g(X^{m_2}) \bmod (X^r - 1, p)$ and thus $\bmod\, (h(X), p)$;
hence $g(X)^{m_1 + kr} = g(X^{m_1 + kr})$ in $K$. But $X^{m_1 + kr} \equiv X^{m_1} \bmod (X^r - 1)$
and therefore $\bmod\, (h(X))$. Thus we obtain the equality in $K^*$

$$g(X)^{m_1}\, g(X)^{kr} = g(X^{m_1}) = g(X)^{m_1},$$

where the last equality comes from the hypothesis that $m_1 \in \mathcal{I}$. From
this, we of course have that $g(X)^{kr} = 1 \in K^*$ and hence $\mathrm{card}(G)$ divides
$kr = m_2 - m_1$.


Proof. (end of the proof of Theorem 2-4.4) In order to apply the lemma,
we use that $N$, $p$ and hence all of the products of powers $N^i p^j$ are in $\mathcal{I}$.
Recall that these elements generate a subgroup of order $d$ in $(\mathbf{Z}/r\mathbf{Z})^*$. If
we set

$$E := \{(i, j) \in \mathbf{N} \times \mathbf{N} \mid 0 \le i, j \le \sqrt{d}\},$$

then the cardinality of $E$ is $(\lfloor \sqrt{d} \rfloor + 1)^2 > d$. By the pigeonhole principle⁴,
there are two elements $N^{i_1} p^{j_1}$ and $N^{i_2} p^{j_2}$, which are congruent modulo $r$,
and such that $(i_1, j_1)$ and $(i_2, j_2)$ are distinct in $E$. These two elements
$N^{i_1} p^{j_1}$ and $N^{i_2} p^{j_2}$ are therefore congruent modulo $\mathrm{card}(G)$. First suppose
that $N^{i_1} p^{j_1} \neq N^{i_2} p^{j_2}$, which implies that

$$\mathrm{card}(G) \le |N^{i_1} p^{j_1} - N^{i_2} p^{j_2}| \le N^{2\sqrt{d}}.$$

If we combine this upper bound with the lower bound gotten above, we see
that

$$\min(L + 1, d) \log 2 \le (2\sqrt{d}) \log N.$$


⁴ The pigeonhole principle says that if we put $n + 1$ pigeons into $n$ boxes, at least one
of the boxes will contain at least two pigeons.

We will prove that this inequality is impossible.


1) If we had $L \ge d$, we could deduce that $\sqrt{d} \le 2 \log N / \log 2$ or moreover
that $d \le (2 \log N / \log 2)^2$. But this inequality is a contradiction since,
by construction, $d \ge d_1$, and we assumed that $d_1 > (2 \log N / \log 2)^2$.

2) Now if $L < d$, we deduce that $(L + 1) \log 2 \le (2\sqrt{d}) \log N$, and since
$d \le r - 1$, this would give us $(L + 1) \log 2 \le 2\sqrt{r - 1} \log N$.


It is therefore sufficient that $L \ge 2\sqrt{r - 1} \log N / \log 2$ in order to
conclude that $N^{i_1} p^{j_1} = N^{i_2} p^{j_2}$. The choice $L = r - 1$
is suitable⁵ since then the desired inequality would be equivalent to the
inequality $\sqrt{r - 1} \ge 2 \log N / \log 2$, which is where the hypothesis $r >
(2 \log N / \log 2)^2 + 1$ comes from. We finish the proof by pointing out that
the equality $N^{i_1} p^{j_1} = N^{i_2} p^{j_2}$ with $(i_1, j_1) \neq (i_2, j_2)$ immediately implies
that $N$ is a power of $p$.


4.8. Remark. One variation of this proof consists of abandoning the
constraint that $r$ is a prime number; we choose a factor, $h(X)$, of $\Phi_r \in
\mathbf{F}_p[X]$ where $\Phi_r$ is the $r$th cyclotomic polynomial (cf. Sect. 6 of this
chapter), and we could then omit every analytic estimate of the distribution
of prime numbers (see [33] for this version, as well as a finer estimate of
the complexity).


5. Factorization 


We briefly consider, and necessarily very unsatisfactorily, the problem of 
factorization: having established, by a primality test, that an integer N is 
not prime, how could we go about factoring it? We start by pointing out 
that the (complete) factorization problem is essentially equivalent to the 
problem of finding one factor, because of course, by iterating this procedure, 
we would achieve a complete factorization. 


The naive factorization method consists of checking if 2 divides $N$, then if
3 divides $N$, etc. If $N = pq$ where $p$ and $q$ are roughly of the same size, i.e.,
$p \sim q \sim \sqrt{N}$, we see that we would need to perform $O(\sqrt{N})$ divisions before
arriving at a factorization of $N$. The naive algorithm is thus exponential.


There do exist more efficient algorithms. In fact, one of the best algorithms
known [49] (using elliptic curves) has a number of operations estimated by
$\exp(C\sqrt{\log p \log\log p})$, where $p$ is the smallest prime factor of $N$. In the
case where $N = pq$ where $p \sim q \sim \sqrt{N}$, we therefore get an algorithm
with an order of complexity $\exp(C'(\log N)^{\kappa})$ (where $\kappa < 1$), which grows
less quickly than any power $N^{\epsilon}$ but more quickly than any power $(\log N)^{k}$.
We say that such an algorithm is subexponential. Another algorithm [19]
(“number field sieve”) has a complexity on the order of
$\exp\left(C(\log N)^{1/3}(\log\log N)^{2/3}\right)$. In 2006,
it was known in practice how to factor an integer with 100 digits in a couple
of hours, and by using many computers over many months, how to factor
an integer with 150 digits. But we still cannot factor, over the course of
a human lifetime, an RSA number, with say 300 digits. A surprising fact
is that the complexity of various algorithms (proven probabilistically or
heuristically) tends to take the form of a function (see [48]):

$$L(b, N) := \exp\left(C(\log N)^{b}(\log\log N)^{1-b}\right).$$

The case $b = 0$, in other words $(\log N)^C$, corresponds to polynomial
algorithms, the case $b = 1$, in other words $N^C$, corresponds to exponential
algorithms and the cases $0 < b < 1$ correspond to subexponential
algorithms; the two algorithms cited above have a complexity estimated at
$L(1/2, N)$ and $L(1/3, N)$.

⁵ We point out however that we could take $L = O(\sqrt{r} \log N)$, which would allow us
to slightly improve the estimate of the complexity.

We are not going to present the most powerful algorithms right away, since
they use tools which surpass the level of this chapter; the algorithms which
use elliptic curves and the number field sieve are presented in Appendix A,
which is about factorization. For the moment, we will settle for describing
an algorithm which is more efficient than the naive algorithm.


From now on, we use the convention that the letter p is reserved for a factor 
of N. 


Pollard’s $\rho$ algorithm. We proceed as follows. We choose $a_0$ between 1
and $N$ and we compute the sequence given by $a_{i+1} = f(a_i)$, where $f(a) :=
a^2 + 1 \bmod N$. We then choose $k$ “big enough, but not too big” and we
calculate $\gcd(a_{2k} - a_k, N)$, hoping that it is nontrivial; if that is the case,
we have found a factorization, if not, we try again with larger $k$. We
will explain below why, at least statistically, there exists $k$ of size $O(\sqrt{p})$,
where $p$ divides $\gcd(a_{2k} - a_k, N)$. Assuming that, we see that the average
complexity of the algorithm is $O(\sqrt{p})$, thus $O(\sqrt[4]{N})$.


The analysis of the complexity is based on the hypothesis that the sequence
$a_i$ modulo $p$ is sufficiently “random”, which has been satisfactorily confirmed
in practice. Now, the probability that $r$ numbers modulo $p$ chosen “at
random” are all distinct is⁶

$$P_r = \left(1 - \frac{1}{p}\right)\left(1 - \frac{2}{p}\right) \cdots \left(1 - \frac{r-1}{p}\right) \le \exp\left(-\frac{r(r-1)}{2p}\right).$$

If we take $r$ on the order of $\sqrt{p}$, say $r \ge 2\sqrt{p}$, the probability that two
of the numbers are equal (modulo $p$) will be $\ge 1/2$, thus we have a good
chance to have two indices $i < j < r$ such that $a_i \equiv a_j \bmod p$. Considering
the construction of the sequence, we would have $a_{i+m} \equiv a_{j+m} \bmod p$ for every
$m \ge 0$, and in particular, by taking $m = j - 2i$ and $k = j - i$, we would
have $a_k \equiv a_{2k} \bmod p$ (see [22] for more details).

⁶ Example. If $n \ge 23$, the probability that, among $n$ people, two have the same
birthday is greater than 1/2.
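
The procedure can be sketched as follows, computing $\gcd(a_{2k} - a_k, N)$ for $k = 1, 2, 3, \dots$ while advancing $a_k$ by one step and $a_{2k}$ by two steps at each iteration. The function name `pollard_rho`, the default starting value $a_0 = 2$ and the step limit are our own choices for this sketch.

```python
from math import gcd

def pollard_rho(n: int, a0: int = 2, max_steps: int = 10**6):
    """Return a nontrivial factor of n, or None if this starting value fails."""
    def f(a: int) -> int:
        return (a * a + 1) % n

    slow = fast = a0 % n              # slow = a_k, fast = a_{2k}
    for _ in range(max_steps):
        slow = f(slow)                # a_k     -> a_{k+1}
        fast = f(f(fast))             # a_{2k}  -> a_{2k+2}
        d = gcd(fast - slow, n)
        if 1 < d < n:
            return d                  # nontrivial factor found
        if d == n:
            return None               # collision modulo every prime factor at once
    return None

print(pollard_rho(8051))              # 8051 = 83 * 97
```

When the function returns `None`, one would typically try again with another $a_0$ (or another polynomial $f$).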


“Difference of squares” algorithm. The second algorithm, that we will
only sketch, is based on the fact that the number of elements $a \in (\mathbf{Z}/N\mathbf{Z})^*$
such that $a^2 = 1$ is at least equal to 4 if $N$ has at least two distinct prime
factors. If we knew how to compute a square root in $(\mathbf{Z}/N\mathbf{Z})^*$, say $\mathcal{A}(x)$,
with a fast algorithm $\mathcal{A}$, then we could factor $N$ like this: take $a$ at random
and calculate $b = \mathcal{A}(a^2)$. Then we of course have that $a^2 \equiv b^2 \bmod N$, or
even that $N$ divides $(a + b)(a - b)$. Now, there is (at least) a one in two
chance that $\pm a \bmod N$ is not the square root calculated by $\mathcal{A}$ and, in
this case, the calculation of $\gcd(N, a + b)$ or of $\gcd(N, a - b)$ would give
us a factorization. Unfortunately, or luckily, we do not know of any fast
algorithm $\mathcal{A}$ (it is even possible that one does not exist). One extension
of this idea is the following: instead of directly looking for an equality
$a^2 \equiv b^2 \bmod N$, we try to construct one. In order to do this, we randomly
take $a$ close to $\sqrt{N}$, we reduce $a^2$ modulo $N$ (taking care to take the
representative in $[-N/2, N/2]$) and we try to factor it with small prime
numbers. In this way, we get a family of congruences $a_i^2 \equiv b_i \bmod N$, where
each $b_i$ is, up to sign, a product of small prime numbers.
We therefore look for a combination of these numbers which provides an
equality of the type $\prod_i a_i^2 \equiv \prod_i b_i \bmod N$ in which the right-hand side is
also a square (this is a linear algebra problem
over $\mathbf{F}_2$). This idea, presented very vaguely here, is expanded on in more
detail in Appendix A, when we describe the number field sieve algorithm.
Properly quantified, this algorithm has an average (heuristic) complexity on
the order of $L(1/2, N)$, which is already remarkable, even if it is insufficient
for very large numbers.


Examples of precautions to take when choosing p and q for the 
RSA method. We will only give some very elementary indications, since 
the question is fairly complex, and in fact largely open. 


1) The absolute value, $|p - q|$, must be large. We can see why by writing
$q = p + \delta$ where $\delta$ is much smaller than $p$. Since $N = pq$, then $\sqrt{N} =
p\sqrt{1 + \delta/p} \sim p + \delta/2$ and we could find $p$ with the “naive” algorithm in
$O(\delta)$ steps!

2) It must be that $p - 1$ (resp. $q - 1$) are not too smooth, in other words,
cannot be factored too quickly, for example the product of small prime
numbers. To see why this is true, choose $C > 0$, and let $p_1, \dots, p_k$ be the
prime numbers smaller than $C$; the set $S := \{s = p_1^{m_1} \cdots p_k^{m_k} \mid s \le N\}$ has
cardinality $O((\log N)^k)$, and we can therefore calculate $\gcd(a^s - 1, N)$ for
some values of $a$ and $s \in S$ in polynomial time. If $p - 1 \in S$ (in other words
if $p - 1$ only has prime factors $< C$), then we have a very good chance of
being able to factor $N$.


3) A less obvious constraint is that it must be that the “secret” exponent
$e$ is not too small. It is clear that if $e = O(\log N)$ for example, then by
trying $O(\log N)$ times, we will find $e$, but in fact it can be shown that you
must avoid having $e \le N^{1/4}$ (see Exercise 3-6.12 of Chap. IV).
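
The first two precautions can be illustrated by the following sketch. The first function looks for a factor by trial division starting at $\lfloor\sqrt{N}\rfloor$ and going down, which succeeds in $O(\delta)$ steps when $q = p + \delta$. The second is a simplified variant of the attack described in 2): instead of running through the set $S$, it uses the single exponent $s = B!$ for a chosen bound $B$, which is a multiple of $p - 1$ as soon as all prime powers dividing $p - 1$ are $\le B$. Both function names and this choice of exponent are our own assumptions.

```python
from math import gcd, isqrt

def factor_near_sqrt(n: int):
    """Trial division downward from floor(sqrt(n)); fast if |p - q| is small."""
    c = isqrt(n)
    while c > 1:
        if n % c == 0:
            return c, n // c
        c -= 1
    return None

def factor_if_p_minus_1_smooth(n: int, bound: int, a: int = 2):
    """Compute gcd(a^(bound!) - 1, n); succeeds if p - 1 divides bound!
    for one prime factor p of n but not for all of them."""
    x = a % n
    for k in range(2, bound + 1):
        x = pow(x, k, n)              # after the loop, x = a^(bound!) mod n
    d = gcd(x - 1, n)
    return d if 1 < d < n else None

print(factor_near_sqrt(5959))                 # 5959 = 59 * 101
print(factor_if_p_minus_1_smooth(299, 4))     # 299 = 13 * 23 and 13 - 1 divides 4!
```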


These relatively trivial remarks could cast doubt on the security of the RSA
system (see [17] for a more precise description of the catalogued attacks
on the RSA system). However, theoretical support for it is provided by
the following considerations. Let us call P the class of problems for which
there exists a polynomial algorithm (for example the problem of deciding
whether a number is prime is in P, by Agrawal-Kayal-Saxena). We can
define a class NP, a priori much larger than P, which is the class of problems
for which there exists a polynomial verification (for example, the problem
of factorization of a number is clearly in NP, since if we are given a
factorization, we can verify it in polynomial time). However, the factorization
problem has a subexponential solution. The security of the RSA system
rests, from a theoretical point of view, on the hypothesis that the
factorization problem is not in P. In fact, it is a special case of a large problem
in complexity theory⁷:


Is it true that P $\neq$ NP?


6. Error-Correcting Codes 


We give a glimpse of another industrial application of algebra and
arithmetic: the construction of “error-correcting codes”, which can, to a certain
degree, reconstruct a message if its transmission was slightly defective. This
technique is for example needed to produce CD readers, to transmit images
by space probes, etc. If this introduction leaves you hungry to learn more,
I recommend Demazure’s book, Cours d’algèbre [3].


6.1. Generalities about Error-Correcting Codes 


In order to transmit information, we assume that we are using a finite
alphabet $Q$, containing $q$ symbols or letters and that we are sending words
of a fixed length $n$; a word is therefore an element of $Q^n$. We can think
of binary language, i.e., $Q := \{0, 1\}$, or of genetic codes, for example $Q :=
\{A, C, G, U\}$ (the bases found in RNA are A for adenine, C for cytosine,
G for guanine and U for uracil). We will most often take the example of
$Q := \mathbf{F}_q$, which has the disadvantage of limiting the possible values of $q$
but the advantage of providing a richer structure.

⁷ This problem P $\neq$ NP is one of the seven problems, for the solution of which a
million dollars is offered by the Clay Mathematics Institute.


The set of words $Q^n$ can be endowed with a Hamming distance, defined as
follows. If $x = (x_1, \dots, x_n) \in Q^n$ and $x' = (x_1', \dots, x_n') \in Q^n$, then

$$d(x, x') := \mathrm{card}\,\{i \in [1, n] \mid x_i \neq x_i'\}.$$

It can easily be checked that this is in fact a distance.


A code is a subset $\mathcal{C} \subset Q^n$ containing at least two distinct elements of $Q^n$;
we define the distance of a code as

$$d(\mathcal{C}) := \min_{x \neq x' \in \mathcal{C}} d(x, x').$$

Once we have chosen a code $\mathcal{C}$, the principle consists of only sending those
messages which belong to $\mathcal{C}$. If we know that at most $d(\mathcal{C}) - 1$ transmission
errors have been committed, then using the error-correcting code will enable
us to establish the existence of one or more errors. Furthermore, if $t$ errors
have been committed during the transmission of a word and if $2t + 1 \le d(\mathcal{C})$,
we see that there exists one single word in $\mathcal{C}$ located at a distance $\le t$ from
the received word. In conclusion, the code allows us to correct $t$ errors and
we say that it is $t$-correcting. If we denote by $d = d(\mathcal{C})$ the distance of the
code and $t = t(\mathcal{C})$ the number of errors that are systematically corrected
by the code, we easily see that the relationship between the two is given by
$t = \left\lfloor \frac{d-1}{2} \right\rfloor$ and conversely $d = 2t + 1$ or $2t + 2$. Except for some examples,
we leave aside the question of decoding, which is essentially the study of
algorithms which allow you to find the word of the code located at a minimal
distance from a given word (it should be noted that you cannot in general
guarantee the uniqueness of this word except under certain conditions).
One of the properties required of a code is obviously that it corrects or
finds the most possible errors (we could also insist that the decoding be
the simplest possible). An intuitively obvious requirement is that it uses
the least amount of space; we could formalize this idea by introducing
the code rate $t/n$, and the information rate that we define as the ratio
$\log \mathrm{card}(\mathcal{C}) / (n \log q)$. Information theory, developed by Shannon (see the
founding article [67]), says that if we are willing to send longer and longer
messages (i.e., to let $n$ be very large), then there exist codes as safe as we
want them to be, with an information rate close to 1. Shannon’s theorem
is however an existence theorem; it does not specify how to construct such
codes.


We are actually going to exclusively concentrate on linear codes, where the
alphabet is (in bijection with) $\mathbf{F}_q$, the space of words is (in bijection with)
the vector space $(\mathbf{F}_q)^n$ and $\mathcal{C}$ is a subspace. In the case of $q = 2$, we are
talking about binary codes, in the case $q = 3$, we are talking about ternary
codes, etc.


The most important parameters of a linear code are the cardinality of the
alphabet $q = \mathrm{card}\, Q$, its length say $n$, its dimension say $k := \dim \mathcal{C}$, its
distance $d(\mathcal{C})$, its code rate and its information rate $k/n$.


Remark. Let $\mathcal{C} \subset \mathbf{F}_q^n$ be a linear code. We define the weight $w(x)$ of an
element $x$ as the number of non-zero components of $x$. We can easily see that

$$d(\mathcal{C}) = \min_{x \in \mathcal{C},\, x \neq 0} d(0, x) = \min_{x \in \mathcal{C},\, x \neq 0} w(x).$$


6.1.1. Examples. 1) The most basic example of a code is the use of a
parity bit: in order to transmit a word $x = (x_1, \dots, x_{n-1}) \in (\mathbf{F}_2)^{n-1}$, we
send $x' = (x_1, \dots, x_{n-1}, x_1 + \cdots + x_{n-1}) \in (\mathbf{F}_2)^n$. To see if the received
message $x' = (x_1, \dots, x_n)$ is correct, we check whether $x_n = x_1 + \cdots + x_{n-1}$.


This code has length $n$ and dimension $n - 1$. It allows us to find an error
but not to correct it. Its distance is 2.


2) Hamming code. Take the set of words with seven binary digits, $q = 2$,
$n = 7$, and let $\mathcal{C}$ be the code with basis

$$e_0 = (1, 1, 0, 1, 0, 0, 0)^T, \quad e_1 = (0, 1, 1, 0, 1, 0, 0)^T, \quad e_2 = (0, 0, 1, 1, 0, 1, 0)^T, \quad e_3 = (0, 0, 0, 1, 1, 0, 1)^T.$$


The coding principle is simple: in order to transmit a message $m =
(m_0, m_1, m_2, m_3)$, we transmit $x = m_0 e_0 + m_1 e_1 + m_2 e_2 + m_3 e_3$. For
this simple example, we will explain the decoding under the hypothesis that
at most one error was committed. Equations of the vector subspace $\mathcal{C}$ are
given by

$$L(x) = (x_0 + x_3 + x_5 + x_6,\; x_1 + x_3 + x_4 + x_5,\; x_2 + x_4 + x_5 + x_6) = 0.$$


For each vector $e$ of weight 1, we then calculate the triplet $L(e)$. From
this, we obtain the following algorithm of correction and decoding. After
having received the message $x = (x_0, \dots, x_6)$, we check whether $L(x) = 0$.
If $L(x) = 0$, the message is correct, if $L(x) = (1, 0, 0)$, then $x_0$ must be
corrected, if $L(x) = (0, 1, 0)$, then $x_1$ must be corrected and if $L(x) =
(1, 1, 1)$, then $x_5$ must be corrected. Finally, if $L(x) = (1, 0, 1)$, then $x_6$
must be corrected. (The remaining values of $L(x)$ correspond to an error in
$x_2$, $x_3$ or $x_4$, which does not affect the coordinates used below.) Thus we
have $m = (x_0, x_0 + x_1, x_5, x_6)$.


We denote by $T(x_1, \dots, x_7) := (x_7, x_1, \dots, x_6)$ the “shift”, so we have that
$T(e_0) = e_1$, $T(e_1) = e_2$, $T(e_2) = e_3$ and $T(e_3) = e_0 + e_1 + e_2$. Thus
$T(\mathcal{C}) = \mathcal{C}$ ($\mathcal{C}$ is then called cyclic). It is easy to see that each non-zero
vector in $\mathcal{C}$ has at least three non-zero coordinates, and therefore $d(\mathcal{C}) = 3$.
Therefore, this code is 1-correcting and allows us to identify two errors but
not to correct them.
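
The coding and decoding procedure of this example can be sketched as follows. The basis is the one given above; the three check forms are written as index sets and are chosen so that they vanish on $e_0, \dots, e_3$, which the test code verifies. The function names are our own.

```python
BASIS = [
    (1, 1, 0, 1, 0, 0, 0),  # e_0
    (0, 1, 1, 0, 1, 0, 0),  # e_1 = T(e_0)
    (0, 0, 1, 1, 0, 1, 0),  # e_2 = T(e_1)
    (0, 0, 0, 1, 1, 0, 1),  # e_3 = T(e_2)
]
# Index sets of three linear forms vanishing on the code (coordinates x_0..x_6).
CHECKS = [(0, 3, 5, 6), (1, 3, 4, 5), (2, 4, 5, 6)]

def encode(m):
    """x = m_0 e_0 + m_1 e_1 + m_2 e_2 + m_3 e_3 over F_2."""
    return tuple(sum(mi * e[j] for mi, e in zip(m, BASIS)) % 2 for j in range(7))

def syndrome(x):
    """The triplet L(x)."""
    return tuple(sum(x[i] for i in idx) % 2 for idx in CHECKS)

# L(e) for each vector e of weight 1, mapped back to the faulty coordinate.
SYNDROME_TO_POSITION = {
    syndrome(tuple(int(i == j) for i in range(7))): j for j in range(7)
}

def decode(x):
    """Correct at most one error, then read off m = (x_0, x_0 + x_1, x_5, x_6)."""
    x = list(x)
    s = syndrome(x)
    if s != (0, 0, 0):
        x[SYNDROME_TO_POSITION[s]] ^= 1
    return (x[0], (x[0] + x[1]) % 2, x[5], x[6])
```

Since the seven triplets $L(e)$ are distinct and non-zero, a single error is always located unambiguously.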


An amusing example. The previous code suggests that it is possible to 
recover an element of F3 (or say an integer between 0 and 15) starting 
with an element of F5 (or say seven yes/no pieces of information) if at 
most one error has been committed (granted at most one of the bits of 
information is false). One version of this is the seven following questions 
which allow us to determine an integer N between 0 and 15. 

1) Is the integer N > 8? 

) Is the integer N in the set {4,5,6,7, 12, 13,14, 15}? 

) Is the integer N in the set {2,3,6,7, 10, 11,14, 15}? 
4) Is the integer N odd? 

) Is the integer N in the set {1,2,4,7,9,10,12,15}? 

) Is the integer N in the set {1,2,5,6,8,11, 12,15}? 
7) Is the integer N in the set {1,3, 4,6, 8, 10, 13,15}? 


We leave as an exercise the justification of the following algorithm. We
denote the answers to the above questions by $m = (m_1, \dots, m_7)$ ($m_i = 1$
if the $i$th answer is yes, $m_i = 0$ if not), and we compute, modulo 2,
$a_1 = m_4 + m_5 + m_6 + m_7$, $a_2 = m_2 + m_3 + m_6 + m_7$ and
$a_3 = m_1 + m_3 + m_5 + m_7$. If $a_1 = a_2 = a_3 = 0$, we conclude that there is
no error; if not, we change the $r$th answer $m_r$, where $r = a_1a_2a_3$ (binary
numeral notation), and the number we are looking for is therefore written

$$N = m_1m_2m_3m_4.$$
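
The seven-question game can be tried out directly. The following Python sketch is only an illustration: the function names `answers` and `recover` are ours, not the book's. It answers the seven questions truthfully, then recovers $N$ from answers containing at most one lie, using the syndrome $a_1a_2a_3$ described above.

```python
# Sets appearing in questions 2, 3, 5, 6 and 7 of the text.
Q2 = {4, 5, 6, 7, 12, 13, 14, 15}
Q3 = {2, 3, 6, 7, 10, 11, 14, 15}
Q5 = {1, 2, 4, 7, 9, 10, 12, 15}
Q6 = {1, 2, 5, 6, 8, 11, 12, 15}
Q7 = {1, 3, 4, 6, 8, 10, 13, 15}


def answers(n):
    """Truthful answers (m1, ..., m7) to the seven questions, for 0 <= n <= 15."""
    return [int(n >= 8), int(n in Q2), int(n in Q3), int(n % 2 == 1),
            int(n in Q5), int(n in Q6), int(n in Q7)]


def recover(m):
    """Recover N from seven answers of which at most one is false."""
    m = list(m)
    a1 = (m[3] + m[4] + m[5] + m[6]) % 2   # m4 + m5 + m6 + m7
    a2 = (m[1] + m[2] + m[5] + m[6]) % 2   # m2 + m3 + m6 + m7
    a3 = (m[0] + m[2] + m[4] + m[6]) % 2   # m1 + m3 + m5 + m7
    r = 4 * a1 + 2 * a2 + a3               # r = a1 a2 a3 in binary
    if r != 0:
        m[r - 1] ^= 1                      # flip the r-th answer
    return 8 * m[0] + 4 * m[1] + 2 * m[2] + m[3]
```

For instance, `recover(answers(11))` gives back 11, and so does the same call after any single answer has been flipped.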


We will now show how to characterize and construct codes and how to 
deduce new codes from the given ones by using elementary linear algebra. 
We denote by n the length of the codes and by k their dimension, unless 
specified otherwise. 


6.1.2. Definition. A generator matrix of a code $\mathcal{C}$ is a matrix whose
rows form a basis of $\mathcal{C}$. (It is therefore a matrix of rank $k$ having $k$ rows
and $n$ columns.) A parity-check matrix of a code $\mathcal{C}$ is a matrix whose rows
form a basis for the linear forms which are zero over $\mathcal{C}$. (It is therefore a
matrix of rank $n - k$ having $n - k$ rows and $n$ columns.)


6.1.3. Remarks. Being given a generator matrix is of course equivalent
to being given a basis of the vector space $\mathcal{C}$, and being given a parity-check
matrix is of course equivalent to being given a basis of the linear equations
which define $\mathcal{C}$ in $\mathbf{F}_q^n$. If $A$ is a generator matrix and $B$ a parity-check
matrix, we easily see that $A\,{}^tB = 0$, or also $B\,{}^tA = 0$. Moreover, we can
recognize the distance of the code as the smallest number $d$ such that there
exist $d$ dependent column vectors in $B$.


Assume that we are given a code $\mathcal{C}$ with parity-check matrix $B$ and assume
that the code is 1-correcting. We show how to decode a received
message $x'$ which differs in at most one coordinate from the sent
message $x$. First of all, if we denote the error committed by $\varepsilon = x' - x$, we
see that $B(x') = B(\varepsilon)$, since $B(x) = 0$. We will therefore compute $B(x')$; if
this is zero, then no error has been committed; if not, we compute the images
$f_i = B(e_i)$ of the vectors $e_i$ of the canonical basis. If only one error has
been committed, we find a unique $i$ such that $B(x')$ is proportional to $f_i$, say
$B(x') = a_if_i$, and therefore $\varepsilon = a_ie_i$ and $x = x' - a_ie_i$.
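
As an illustration of this decoding procedure, here is a minimal Python sketch over a prime field $\mathbf{F}_p$. The function names are ours, and the matrix `B_HAMMING` is our reading of the three linear forms $L$ of the length-7 example above (an assumption, since that formula is damaged in the source). The code searches for the unique index $i$ and scalar $a_i$ with $B(x') = a_if_i$.

```python
def syndrome(B, x, p):
    """Image B(x) of the vector x under the parity-check matrix B, modulo p."""
    return tuple(sum(b * c for b, c in zip(row, x)) % p for row in B)


def correct_one_error(B, received, p=2):
    """Return the sent word, assuming at most one coordinate of `received` is wrong."""
    s = syndrome(B, received, p)
    if not any(s):
        return list(received)              # B(x') = 0: no error
    for i in range(len(received)):
        f_i = [row[i] % p for row in B]    # f_i = B(e_i), the i-th column of B
        for a in range(1, p):
            if tuple(a * c % p for c in f_i) == s:
                fixed = list(received)
                fixed[i] = (fixed[i] - a) % p   # x = x' - a_i e_i
                return fixed
    raise ValueError("syndrome matches no single error")


# Assumed parity-check matrix of the length-7 binary example:
# rows x0+x3+x5+x6, x1+x3+x4+x6, x2+x4+x5+x6.
B_HAMMING = [
    [1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 1],
]
```

The double loop over positions and scalars is deliberately naive; for a long code one would precompute a table from syndromes to pairs $(i, a_i)$.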


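If $\mathcal{C}$ is a code of length $n$ over the field $F = \mathbf{F}_q$, we can associate to it the
following codes.

i) Shortened code. Let $d(\mathcal{C}) \le \ell \le n$. We set $\mathcal{C}_\ell := \{x \in F^{\ell} \mid
(x, 0, \dots, 0) \in \mathcal{C}\}$. It is a code of length $\ell$, and we easily see that
$d(\mathcal{C}_\ell) \ge d(\mathcal{C})$.

ii) Extended code. We can create the analogue of the "parity bit" by
constructing $\widetilde{\mathcal{C}} := \{(x_1, \dots, x_{n+1}) \in F^{n+1} \mid (x_1, \dots, x_n) \in \mathcal{C}$ and
$x_1 + \dots + x_n + x_{n+1} = 0\}$. We can easily see that
$d(\mathcal{C}) \le d(\widetilde{\mathcal{C}}) \le d(\mathcal{C}) + 1$. One variation is the even subcode defined as
$\mathcal{C}' = \{x \in \mathcal{C} \mid x_1 + \dots + x_n = 0\}$. We have $d(\mathcal{C}) \le d(\mathcal{C}')$.

iii) Dual code. We define the scalar product $\langle x, y\rangle := x_1y_1 + \dots + x_ny_n$,
and we set $\mathcal{C}^* := \{z \in F^n \mid \forall x \in \mathcal{C},\ \langle z, x\rangle = 0\}$. We have that
$\dim \mathcal{C}^* = n - \dim \mathcal{C}$. An interesting category of binary codes is that
of self-dual codes, i.e., such that $\mathcal{C}^* = \mathcal{C}$; such codes have dimension
$n/2$, and the weight of an element is even since $w(x) \equiv \langle x, x\rangle \bmod 2$.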


As an exercise, you could try to figure out how to construct a parity-check 
(or generator) matrix of each of these codes, starting with the parity-check 
(or generator) matrix of the original code. 


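6.1.4. Lemma. Let $\mathcal{C}$ be a code of dimension $k$ and of length $n$ over $\mathbf{F}_q$.
The following inequalities hold:

i) $d(\mathcal{C}) \le n + 1 - k$;

ii) if $\mathcal{C}$ is $t$-correcting, $1 + \binom{n}{1}(q-1) + \binom{n}{2}(q-1)^2 + \dots + \binom{n}{t}(q-1)^t \le q^{n-k}$.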


Proof. i) The vectors of the form $(x_1, \dots, x_{n+1-k}, 0, \dots, 0)$ form a vector
subspace $J$ of $(\mathbf{F}_q)^n$. Since $\dim J + \dim \mathcal{C} = n + 1$, we see that
$J \cap \mathcal{C} \ne \{0\}$, hence the existence of a non-zero vector of $\mathcal{C}$ of weight
$\le n + 1 - k$. For ii), we can observe that for every $x \in \mathbf{F}_q^n$ and $0 \le t \le n$,

$$\operatorname{card}(B(x,t)) = 1 + \binom{n}{1}(q-1) + \binom{n}{2}(q-1)^2 + \dots + \binom{n}{t}(q-1)^t.$$

If the code is $t$-correcting, the balls $B(x,t)$ with center $x \in \mathcal{C}$ are disjoint and
thus

$$\operatorname{card}\Bigl(\bigcup_{x \in \mathcal{C}} B(x,t)\Bigr) = q^k \operatorname{card}(B(0,t)) \le q^n.$$


6.1.5. Definition. A code such that $d(\mathcal{C}) = n + 1 - k$ is called MDS
(maximal distance separable). A $t$-correcting code such that
$\mathbf{F}_q^n = \bigcup_{x \in \mathcal{C}} B(x,t)$ (necessarily a disjoint union) is called perfect
$t$-correcting.


The Hamming code of length 7 studied in the examples is perfect 1-correcting
since, in this case, we can show that $\operatorname{card} B(x,1) = 1 + 7 = 8$ and
$8 \operatorname{card} \mathcal{C} = 2^7$. We could also notice that this code is not MDS, because
$d(\mathcal{C}) = 3 < 4 = n - k + 1$.
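
The two notions just defined are easy to test numerically. The sketch below (function names are ours) computes the cardinality of a ball from the formula in the proof of Lemma 6.1.4 and checks the equality case of each inequality. Only the parameters $(n, k, d, t, q)$ are used, not the code itself, so this checks the counting condition rather than constructing anything.

```python
from math import comb


def ball_size(n, t, q):
    """Number of vectors of F_q^n at Hamming distance <= t from a given vector."""
    return sum(comb(n, j) * (q - 1) ** j for j in range(t + 1))


def meets_sphere_packing_bound(n, k, t, q):
    """True when q^k disjoint balls of radius t fill F_q^n exactly (perfect code)."""
    return q ** k * ball_size(n, t, q) == q ** n


def is_mds(n, k, d):
    """True when the distance reaches the bound d = n + 1 - k."""
    return d == n + 1 - k
```

With the parameters of the length-7 Hamming code, `meets_sphere_packing_bound(7, 4, 1, 2)` holds while `is_mds(7, 4, 3)` does not, in agreement with the remark above.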


6.2. Linear Cyclic Codes 


We will explicitly describe an interesting class of codes which in particu- 
lar contains some of the classical codes, such as that of Hamming, Reed- 
Solomon and Golay and which will lead us into the study of cyclotomic 
polynomials. 


6.2.1. Definition. A linear cyclic code is a linear code, $\mathcal{C}$, of length $n$,
which is stable under the transformation
$T(a_0, a_1, \dots, a_{n-1}) = (a_{n-1}, a_0, \dots, a_{n-2})$.


We can give a nice algebraic characterization of cyclic codes by introducing
the natural isomorphism of vector spaces $\mathbf{F}_q^n \cong \mathbf{F}_q[X]_n \cong \mathbf{F}_q[X]/Q\,\mathbf{F}_q[X]$,
where $\mathbf{F}_q[X]_n$ represents the polynomials of degree $< n$ and where $Q$ is
a polynomial of degree $n$. Since the characteristic (or minimal) polynomial
of the endomorphism $T$ is $Q = X^n - 1$, we therefore choose this
value. Hence we denote by $\psi : \mathbf{F}_q^n \to \mathbf{F}_q[X]_n \cong \mathbf{F}_q[X]/(X^n - 1)$ the map defined
as $(a_0, a_1, \dots, a_{n-1}) \mapsto a_0 + a_1X + \dots + a_{n-1}X^{n-1} \bmod (X^n - 1)$. We
immediately see that

$$\psi \circ T(a_0, a_1, \dots, a_{n-1}) = X\,(a_0 + a_1X + \dots + a_{n-1}X^{n-1}) \bmod (X^n - 1).$$


Thus a vector subspace $\mathcal{C} \subset \mathbf{F}_q^n$ is stable under $T$ if and only if its image
under $\psi$ is stable under multiplication by $X$. We should point out that an
$\mathbf{F}_q$-vector subspace of $\mathbf{F}_q[X]/(X^n - 1)$ which is stable under multiplication
by $X$ is nothing other than an ideal of $\mathbf{F}_q[X]/(X^n - 1)$. Finally, the ideals
of $\mathbf{F}_q[X]/(X^n - 1)$ correspond to the ideals of $\mathbf{F}_q[X]$ which contain the
polynomial $X^n - 1$ and therefore are of the form $P\,\mathbf{F}_q[X]$ where $P$ divides
$X^n - 1$. We summarize this discussion in the following theorem.


6.2.2. Theorem. Let $K := \mathbf{F}_q$ and let $\mathcal{C}$ be a cyclic code of length $n$. We
identify $K^n$ with $K[X]/(X^n - 1)$ via $(a_0, a_1, \dots, a_{n-1}) \mapsto a_0 + a_1X + \dots +
a_{n-1}X^{n-1}$. There exist natural bijections between the following objects:

i) a cyclic code of length $n$;
ii) an ideal of $K[X]/(X^n - 1)$;
iii) a monic polynomial which divides $X^n - 1$ in $K[X]$.

One of the bijections associates $P$, which divides $X^n - 1$, to the ideal $\mathcal{C}$ of
$K[X]/(X^n - 1)$ generated by its class modulo $X^n - 1$, and another associates
an ideal of $K[X]/(X^n - 1)$ to the vector subspace corresponding to $\mathcal{C}$ of
$K^n$. Furthermore, $\dim \mathcal{C} = n - \deg(P)$.


This leads to the following problem: how to decompose the polynomial
$X^n - 1$ in $\mathbf{F}_q[X]$?

It is of course better to start with a decomposition in $\mathbf{Z}[X]$ (or $\mathbf{Q}[X]$),
which is provided by cyclotomic polynomials. In order to define these, we
denote by $\mu_n = \{\zeta \in \mathbf{C} \mid \zeta^n = 1\}$ the group of $n$th roots of unity and
by $\mu_n^*$ the subset of primitive $n$th roots of unity, and hence $\operatorname{card} \mu_n = n$ and
$\operatorname{card} \mu_n^* = \varphi(n)$.


We will need Gauss’s lemma. 


6.2.3. Lemma. If $P = p_0 + p_1X + \dots + p_dX^d \in \mathbf{Z}[X]$ is a non-zero
polynomial, we define its content as $c(P) := \gcd(p_0, \dots, p_d)$. We therefore
have that

$$c(PQ) = c(P)\,c(Q).$$

Proof. By factoring $P = c(P)P^*$ and $Q = c(Q)Q^*$, we see that $c(PQ) =
c(P)c(Q)c(P^*Q^*)$. So we have reduced the proof to showing that if $P$ and
$Q$ are primitive (i.e., $c(P) = c(Q) = 1$), then $c(PQ) = 1$. If $p$ is a prime
number, we denote by $\bar{P}$ the image of $P$ in $\mathbf{F}_p[X]$. We have that $\bar{P} \ne 0$
and $\bar{Q} \ne 0$, thus $\bar{P} \cdot \bar{Q} = \overline{PQ} \ne 0$ because $\mathbf{F}_p[X]$ is an integral domain. So
no prime $p$ divides $c(PQ)$, which implies that it is invertible.


6.2.4. Corollary. Let $P \in \mathbf{Z}[X]$. Suppose that there exist $Q, R \in \mathbf{Q}[X]$
such that $P = QR$. Then there exists $\lambda \in \mathbf{Q}^*$ such that $\lambda Q$ and $\lambda^{-1}R$ have
integer coefficients.


Proof. We can write $Q = \frac{a}{b}Q_1$ (resp. $R = \frac{c}{d}R_1$), where $a, b, c, d$ are
integers and where $Q_1$ and $R_1$ are primitive polynomials with integer
coefficients. We can deduce from this that $bd\,P = ac\,Q_1R_1$ and, since the
equality is in $\mathbf{Z}[X]$, we can deduce, using Gauss's lemma, that $bd\,c(P) = ac$
and, in particular, that $bd$ divides $ac$. Thus $P = c(P)Q_1R_1$, so that
$\lambda = b\,c(P)/a$ works.


6.2.5. Corollary. If $\alpha \in \mathbf{C}$ is a root of a monic polynomial with
integer coefficients, then the minimal (monic) polynomial of $\alpha$ has integer
coefficients.


Proof. Let $P$, a priori in $\mathbf{Q}[X]$, be the minimal polynomial of $\alpha$ and let
$Q$ be monic with integer coefficients such that $Q(\alpha) = 0$. Then $Q = PR$,
where $R$ is in $\mathbf{Q}[X]$. Gauss's lemma says that there exists $\lambda \in \mathbf{Q}^*$ such
that $R_0 = \lambda R$ and $P_0 = \lambda^{-1}P$ have integer coefficients. By observing that
$Q = P_0R_0$, it follows that the leading coefficient of $P_0$ is invertible, and
hence $P = \pm P_0$ has integer coefficients.


6.2.6. Definition. The $n$th cyclotomic polynomial, denoted $\Phi_n$, is defined
as

$$\Phi_n(X) = \prod_{\zeta \in \mu_n^*} (X - \zeta).$$


These polynomials, a priori with complex coefficients, in fact have integer
coefficients and moreover provide a decomposition of $X^n - 1$ into irreducible
factors, as shown in the following theorem.


6.2.7. Theorem. The polynomials $\Phi_n$ have the following properties.

i) $\Phi_n \in \mathbf{Z}[X]$ and $\deg \Phi_n = \varphi(n)$.
ii) $X^n - 1 = \prod_{d \mid n} \Phi_d(X)$.
iii) The polynomials $\Phi_n$ are irreducible in $\mathbf{Z}[X]$ and in $\mathbf{Q}[X]$.


Proof. With the given definition, $\Phi_n \in \mathbf{C}[X]$. Formula ii) is clear, as well as
the fact that $\deg(\Phi_n) = \varphi(n)$; however it is less clear that in fact $\Phi_n \in \mathbf{Z}[X]$
and that $\Phi_n$ is irreducible in $\mathbf{Q}[X]$ (or $\mathbf{Z}[X]$). We shall start by showing
that the coefficients of $\Phi_n$ are integers. It is clear that $\Phi_1(X) = X - 1 \in
\mathbf{Z}[X]$, and formula ii) leads us to try induction on $n$. The polynomial
$B := \prod_{d \mid n,\, d \ne n} \Phi_d(X)$ is monic and, by applying induction, has integer
coefficients. We can therefore carry out the division algorithm in $\mathbf{Z}[X]$, and
obtain $X^n - 1 = BQ + R$. Formula ii) then guarantees that $B$ divides $R$ (in
$\mathbf{Q}[X]$), so $R = 0$ and $Q = \Phi_n$.

We will now show that $\Phi_n$ is irreducible in $\mathbf{Z}[X]$. Let $\zeta$ be a primitive
$n$th root of unity and $P$ its minimal polynomial over $\mathbf{Q}$. We therefore need
to show that $P = \Phi_n$. First, observe that $P \in \mathbf{Z}[X]$. Then choose a prime
number $p$ which does not divide $n$, so $\zeta^p$ is still a primitive $n$th root of
unity. Let $Q$ be its minimal polynomial, which is also in $\mathbf{Z}[X]$. If $P$ and $Q$
were distinct, the product $PQ$ would divide $\Phi_n$. But since $Q(\zeta^p) = 0$, we see
that $\zeta$ is a root of $Q(X^p)$ and thus $Q(X^p) = P(X)R(X)$, for some $R \in \mathbf{Z}[X]$.
By reducing the coefficients modulo $p$, we have

$$\bar{Q}(X^p) = \bar{Q}(X)^p = \bar{P}(X)\bar{R}(X),$$

and so $\bar{P}(X)$ divides $\bar{Q}(X)^p$ in $(\mathbf{Z}/p\mathbf{Z})[X]$. Moreover, the factors of $X^n - 1$,
and hence of $\bar{P}(X)$, are simple in $(\mathbf{Z}/p\mathbf{Z})[X]$ (the derivative of $X^n - 1$ is
$nX^{n-1}$, and we made a point of choosing $p$ so that it does not divide $n$): the
polynomial $\bar{P}(X)$ and $\bar{Q}(X)$ therefore have a common factor. But then, the
square of this factor divides $\bar{\Phi}_n(X)$ in $(\mathbf{Z}/p\mathbf{Z})[X]$, which contradicts the fact
that the factors of $\bar{\Phi}_n(X)$ are simple. To summarize, we have established that
if $p$ is a prime number which does not divide $n$, the minimal polynomial of $\zeta$
kills $\zeta^p$. We easily deduce from this that if $m$ is relatively prime to $n$, then
$P(\zeta^m) = 0$. Thus $\deg(P) \ge \varphi(n)$ and since $P$ divides $\Phi_n$, we have that
$P = \Phi_n$, and it is therefore irreducible.
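
The induction used in the first part of the proof is also a practical way of computing $\Phi_n$: divide $X^n - 1$ by the product $B$ of the $\Phi_d$ for the proper divisors $d$ of $n$. The Python sketch below follows this idea; the helper names and the representation of a polynomial as a list of integer coefficients (lowest degree first) are our own choices, not the book's.

```python
_CACHE = {}


def poly_mul(a, b):
    """Product of two integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def divide_by_monic(num, den):
    """Exact quotient num / den in Z[X], where den is monic and divides num."""
    rem = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        c = rem[shift + len(den) - 1]      # current leading coefficient
        quot[shift] = c
        for j, d in enumerate(den):
            rem[shift + j] -= c * d
    if any(rem):
        raise ValueError("division is not exact")
    return quot


def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial, lowest degree first."""
    if n not in _CACHE:
        B = [1]
        for d in range(1, n):
            if n % d == 0:
                B = poly_mul(B, cyclotomic(d))
        x_n_minus_1 = [-1] + [0] * (n - 1) + [1]
        _CACHE[n] = divide_by_monic(x_n_minus_1, B)
    return _CACHE[n]
```

For example, `cyclotomic(15)` returns `[1, -1, 0, 1, -1, 1, 0, -1, 1]`, that is $X^8 - X^7 + X^5 - X^4 + X^3 - X + 1$. The fact that the division never leaves a remainder is exactly the content of formula ii).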


Since $\Phi_n$ has integer coefficients, we can reduce its coefficients modulo $p$
and consider it as a polynomial in $\mathbf{F}_p[X]$ (or in $\mathbf{F}_q[X]$ with $q = p^f$).


6.2.8. Theorem. The decomposition into irreducible factors of the
polynomial $\Phi_n \in \mathbf{F}_q[X]$ (with $q = p^f$) depends on whether $n$ modulo $p$ is zero
or not.

i) If $n = p^rm$ where $p \nmid m$, we have $\Phi_n(X) = \Phi_m(X)^{p^r - p^{r-1}}$.

ii) If $\gcd(n, q) = 1$ and if $r$ is the order of $q \bmod n$ in $(\mathbf{Z}/n\mathbf{Z})^*$, then $\Phi_n$
can be decomposed into the product of $\varphi(n)/r$ distinct irreducible factors
of degree $r$.


Proof. Assume first that $n = p^rm$. By Fermat's little theorem and
the formulas from Exercise 2-7.12, it follows that $\Phi_m(X)^p = \Phi_m(X^p) =
\Phi_{mp}(X)\Phi_m(X)$, hence $\Phi_{mp}(X) = \Phi_m(X)^{p-1}$, and subsequently that

$$\Phi_{mp^r}(X) = \Phi_{mp}(X^{p^{r-1}}) = \Phi_{mp}(X)^{p^{r-1}} = \Phi_m(X)^{(p-1)p^{r-1}},$$

which proves the first assertion. From now on, suppose that $p$ is relatively
prime to $n$. Let $\beta$ be a primitive $n$th root in an extension of $\mathbf{F}_q$. Every
factor of $\Phi_n$ can be written as $Q = \prod_{i \in I}(X - \beta^i)$, with $I \subset (\mathbf{Z}/n\mathbf{Z})^*$. The
polynomial $Q$ has coefficients in $\mathbf{F}_q$ if and only if

$$Q(X)^q = Q(X^q). \qquad (*)$$

In fact, $\bigl(\sum_i a_iX^i\bigr)^q = \sum_i a_i^q X^{qi}$ and $a \in \mathbf{F}_q$ if and only if $a^q = a$. Thus
the polynomial $Q$ has coefficients in $\mathbf{F}_q$ if and only if

$$\prod_{i \in I}(X^q - \beta^i) = \prod_{i \in I}(X - \beta^i)^q = \prod_{i \in I}(X^q - \beta^{qi}),$$

or even if and only if $I$ is stable under multiplication by $q$ (in $(\mathbf{Z}/n\mathbf{Z})^*$).
The smallest stable subset is clearly of the form $I := \{i, iq, iq^2, \dots, iq^{r-1}\}$.
Also, the irreducible factors of $\Phi_n(X)$ in $\mathbf{F}_q[X]$ are of the form

$$Q = \prod_{j=0}^{r-1}(X - \beta^{iq^j})$$

and, in particular, all have degree $r$.


6.2.9. Examples. 1) Take $n = 11$ and $q = 3$; we see that the order
of $3 \bmod 11$ is equal to 5. Thus $X^{11} - 1 = (X - 1)\Phi_{11}(X)$ in $\mathbf{Z}[X]$ and
$\Phi_{11} = P_1P_2 \in \mathbf{F}_3[X]$, where $\deg(P_i) = 5$. We can check that, in $\mathbf{F}_3[X]$,

$$X^{11} - 1 = (X - 1)(X^5 - X^3 + X^2 - X - 1)(X^5 + X^4 - X^3 + X^2 - 1).$$

2) Take $n = 23$ and $q = 2$; we see that the order of $2 \bmod 23$ is equal to 11.
Thus $X^{23} - 1 = (X - 1)\Phi_{23}(X)$ in $\mathbf{Z}[X]$ and $\Phi_{23} = P_1P_2 \in \mathbf{F}_2[X]$, with
$\deg(P_i) = 11$. We can check that, in $\mathbf{F}_2[X]$,

$$X^{23} - 1 = (X - 1)(X^{11} + X^{10} + X^6 + X^5 + X^4 + X^2 + 1)(X^{11} + X^9 + X^7 + X^6 + X^5 + X + 1).$$

3) Take $n = 15$ and $q = 2$; thus $X^{15} - 1 = (X - 1)\Phi_3(X)\Phi_5(X)\Phi_{15}(X)$ in
$\mathbf{Z}[X]$, with $\Phi_{15} = X^8 - X^7 + X^5 - X^4 + X^3 - X + 1$. The order of $2 \bmod 3$
is equal to 2, the order of $2 \bmod 5$ is equal to 4 and the order of $2 \bmod 15$ is
equal to 4. The polynomials $\Phi_3 = X^2 + X + 1$ and $\Phi_5 = X^4 + X^3 + X^2 +
X + 1$ are therefore irreducible in $\mathbf{F}_2[X]$, and $\Phi_{15} = P_1P_2 \in \mathbf{F}_2[X]$, where
$\deg(P_i) = 4$. We can check that, in $\mathbf{F}_2[X]$,

$$X^{15} - 1 = (X - 1)(X^2 + X + 1)(X^4 + X^3 + X^2 + X + 1)(X^4 + X^3 + 1)(X^4 + X + 1).$$

4) More generally, if $\gcd(q, n) = 1$, a cyclic code of length $n$ corresponds, by
Theorem 2-6.2.2, to a subset $I \subset \mathbf{Z}/n\mathbf{Z}$, which is stable under multiplication
by $q$. More explicitly, the associated code is the ideal of $\mathbf{F}_q[X]/(X^n - 1)$
generated by the polynomial $Q = \prod_{i \in I}(X - \beta^i)$, where $\beta$ is a primitive
$n$th root of unity. To estimate the distance of such a code, we can use the
following result.
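
The examples above can be checked mechanically. The following Python sketch (helper names are ours) lists the subsets of $\mathbf{Z}/n\mathbf{Z}$ that are minimal among those stable under multiplication by $q$, whose sizes are the degrees of the irreducible factors of $X^n - 1$ in $\mathbf{F}_q[X]$, and multiplies out a proposed factorization modulo a prime $p$. It assumes $\gcd(n, q) = 1$ and, for the product check, that $p$ is prime.

```python
def stable_subsets(n, q):
    """Minimal subsets of Z/nZ stable under multiplication by q (gcd(n, q) = 1)."""
    seen, subsets = set(), []
    for i in range(n):
        if i not in seen:
            orbit, j = [], i
            while j not in orbit:
                orbit.append(j)
                j = j * q % n
            seen.update(orbit)
            subsets.append(sorted(orbit))
    return subsets


def poly_mul_mod(a, b, p):
    """Product of two coefficient lists (lowest degree first), reduced modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def product_mod(factors, p):
    """Product of a list of polynomials modulo p."""
    result = [1]
    for f in factors:
        result = poly_mul_mod(result, f, p)
    return result


def x_n_minus_1(n, p):
    """Coefficients of X^n - 1 modulo p."""
    return [(-1) % p] + [0] * (n - 1) + [1]


def sparse(exponents):
    """Polynomial with coefficient 1 at each of the given exponents."""
    coeffs = [0] * (max(exponents) + 1)
    for e in exponents:
        coeffs[e] = 1
    return coeffs
```

For instance, `stable_subsets(11, 3)` returns `[[0], [1, 3, 4, 5, 9], [2, 6, 7, 8, 10]]`, which matches the degrees 1, 5, 5 of Example 1.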


6.2.10. Theorem. Let $\mathcal{C}$ be a linear cyclic code of length $n$ over $\mathbf{F}_q$
associated to $I \subset \mathbf{Z}/n\mathbf{Z}$. If there exist $i$ and $s$ such that $\{i + 1, i +
2, \dots, i + s\} \subset I$, then $d(\mathcal{C}) \ge s + 1$.

Proof. Let $\beta$ be a primitive $n$th root in an extension of $\mathbf{F}_q$ and let $Q$ be a
polynomial modulo $X^n - 1$ which belongs to $\mathcal{C}$. We know that $Q(\beta^{i+j}) = 0$
for $j = 1, \dots, s$. Assume that the weight $w$ of $Q$ (viewed as an element of
$\mathbf{F}_q^n$) is $\le s$, which means that $Q = a_1X^{i_1} + \dots + a_wX^{i_w}$ with $0 \le i_1 <
i_2 < \dots < i_w < n$. We need to show that $Q$ is in fact zero. Now, we
have the equations $a_1\beta^{i_1(i+j)} + \dots + a_w\beta^{i_w(i+j)} = 0$ for $j = 1, \dots, s$. Let
$a'_1 = a_1\beta^{i\,i_1}, \dots, a'_w = a_w\beta^{i\,i_w}$. The equations can be rewritten as

$$\beta^{j i_1}a'_1 + \dots + \beta^{j i_w}a'_w = 0, \quad \text{for } j = 1, \dots, s.$$

The matrix of the $\beta^{j i_h}$ can be extracted from a Vandermonde matrix with
$\beta^{i_h} \ne \beta^{i_k}$ for $h \ne k$, because $\beta$ has order $n$, and its rank therefore equals
$w = \min\{w, s\}$. This means that $a'_1 = \dots = a'_w = 0$, and hence
$a_1 = \dots = a_w = 0$.


6.2.11. Remark. The bound given in the theorem is generally not opti- 
mal. We can see this below in the example of Golay codes. 


6.2.12. Examples. (Linear cyclic codes.) 


We will now describe in detail some examples obtained from choosing $q$, $n$
and a subset $I \subset \mathbf{Z}/n\mathbf{Z}$ which is stable under multiplication by $q$. To be
rigorous, we should clarify that the code that we construct also depends on
the primitive $n$th root $\beta$ that we choose. However, it is not difficult to see
that the various codes obtained from the choices of $\beta$ are all isomorphic. We
will therefore omit $\beta$.


Hamming codes. One first interesting choice of parameters is $n = (q^r -
1)/(q - 1)$, and we can easily check that the order of $q \bmod n$ is $r$. We set
$I := \{1, q, q^2, \dots, q^{r-1}\}$, which defines a code $\mathcal{C}$ of dimension $n - r$ (once
$\beta$, a primitive $n$th root of unity, is chosen). We will now directly verify
that $d(\mathcal{C}) \ge 3$. A polynomial of weight 2 can be written $f = aX^i + bX^j$
with say $0 \le i < j \le n - 1$, and the condition that it is killed by $\beta^{q^\ell}$
for $0 \le \ell \le r - 1$ is therefore written as $a + b\beta^{(j-i)q^\ell} = 0$. Since $\beta$ is
of order $n$, we see that this is impossible except when $a = b = 0$. Thus
the code $\mathcal{C}$ is 1-correcting, and since $\operatorname{card} B(x,1) = 1 + n(q - 1) = q^r$,
we see that $\mathcal{C}$ is perfect 1-correcting, and thus $d(\mathcal{C}) = 3$ or 4 (we show
below that the distance is 3 and that the code is therefore MDS if and only
if $r = 2$). Binary Hamming codes are obtained by taking $q = 2$ and by
choosing $I := \{1, 2, 4, \dots, 2^{r-1}\}$ and hence $k = n - r = 2^r - r - 1$. Since
$\{1, 2\} \subset I$, we see that $d(\mathcal{C}) \ge 3$. For $r = 3$, $q = 2$, $n = 7$, we get the code
studied in the first example (2-6.1.1).


In order to see that the distance of a Hamming code is equal to $d(\mathcal{C}) = 3$,
we write a parity-check matrix $A$ for the code (a matrix with $r$ rows and $n$
columns). The columns $e_1, \dots, e_n$ of $A$ are vectors in $(\mathbf{F}_q)^r$, and we have
just shown that any pair of them is linearly independent. Now, there are
$n = (q^r - 1)/(q - 1)$ of them, and they therefore represent exactly one vector
from each line in $(\mathbf{F}_q)^r$. Since two of the vectors $e_i$ are never dependent,
but of course there exist triples of linearly dependent vectors, we see that
$d(\mathcal{C}) = 3$.

Reed-Solomon codes. These codes correspond to the choice $n = q - 1$,
most often with $q = 2^f$. Let $\alpha$ be a generator of $\mathbf{F}_q^*$. Once we have chosen
$k$, we set

$$g(X) := \prod_{i=1}^{q-1-k} (X - \alpha^i).$$

It follows of course that $k = \dim \mathcal{C}$ and, since $I = \{1, 2, 3, \dots, q - 1 - k\}$, we
have $d(\mathcal{C}) \ge q - k$. But we know that for every linear code, $d(\mathcal{C}) \le n + 1 - k$,
hence $d(\mathcal{C}) = q - k$, and the code constructed in this way is therefore MDS.
Now suppose that $q = 2^f$. We can consider $\mathcal{C}$ as a binary code $\mathcal{C}'$, with
the parameters $n' = (2^f - 1)f$, $k' = kf$ and distance $d(\mathcal{C}') \ge 2^f - k$. One
special feature of this code is that it can correct large numbers of errors:
if $t$ satisfies $2t + 1 \le d(\mathcal{C}) = q - k$, the code can correct $t$ elements of
$\mathbf{F}_{2^f}$, hence $tf$ binary errors if these errors are distributed in bunches! This
feature explains why this type of code is used in the technology of compact
discs.


Ternary Golay code. We know that $3^5 - 1 = 11 \cdot 22$. We choose
$q = 3$, $n = 11$ and the subset of $(\mathbf{Z}/11\mathbf{Z})^*$ generated by 3, in other words
$I := \{1, 3, 4, 5, 9\}$; this code, denoted by $\mathcal{G}_{11}$, is therefore of dimension
6. We point out (but do not use) that $I = \mathbf{F}_{11}^{*2}$. By Theorem 2-6.2.10
on the distance of a cyclic code, we see that $d(\mathcal{G}_{11}) \ge 4$ and, by
considering the factorization of $\Phi_{11}$ in $\mathbf{F}_3[X]$ (cf. Examples 2-6.2.9), we see
that $\mathcal{G}_{11}$ contains a polynomial of weight 5, hence $d(\mathcal{G}_{11}) \le 5$. An
extensive calculation (which is postponed to Exercise 2-7.22 below) allows us
to establish that actually $d(\mathcal{G}_{11}) = 5$. Thus $\mathcal{G}_{11}$ is 2-correcting, and since
$\operatorname{card} B(x,2) = 1 + 2\binom{11}{1} + 2^2\binom{11}{2} = 3^5$, it is clear that the code $\mathcal{G}_{11}$ is perfect
2-correcting (but notice that it is not MDS).


Binary Golay code. We know that $2^{11} - 1 = 23 \cdot 89$ (it is actually the
smallest number of the form $2^p - 1$, with $p$ prime, which is not prime). We
therefore choose $q = 2$, $n = 23$ and $I$ as the subset of $(\mathbf{Z}/23\mathbf{Z})^*$ generated by 2,
in other words $I := \{1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18\}$, and we denote by $\mathcal{G}_{23}$
the associated code. Observe also that $I = \mathbf{F}_{23}^{*2}$. By Theorem 2-6.2.10 on
the distance of a cyclic code, we see that $d(\mathcal{G}_{23}) \ge 5$ and, by considering
the factorization of $\Phi_{23}$ in $\mathbf{F}_2[X]$ (cf. Examples 2-6.2.9), we see that $\mathcal{G}_{23}$
contains a polynomial of weight 7, hence $d(\mathcal{G}_{23}) \le 7$. An extensive
calculation (which is postponed to Exercise 2-7.22, suggested below) allows us
to determine that actually $d(\mathcal{G}_{23}) = 7$. Thus $\mathcal{G}_{23}$ is 3-correcting, and since
$\operatorname{card} B(x,3) = 1 + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^{11}$, it follows that the code $\mathcal{G}_{23}$ is
perfect 3-correcting (but notice that it is not MDS).


6.2.13. Remark. We can show that if we exclude trivial codes (i.e., of
dimension 1, $n - 1$ or $n$), the only perfect $t$-correcting codes are those that
we have already constructed: the Hamming 1-correcting codes and the two
Golay codes, binary and ternary [73].


7. Exercises 


7.1. Exercise. (Newton's method) Recall that Newton's iterative method
(for approximating the zeros of a function) is applicable to differentiable
functions. Let $f$ be a function with a unique zero at $\alpha$; the iteration is
given by

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$

The rate of convergence of this approximation is quadratic, i.e.,
$|x_{n+1} - \alpha| \le C|x_n - \alpha|^2$. Clarify and prove this assertion for the function
$f(x) := x^m - a$, and deduce a fast calculation algorithm for approximating
$\sqrt[m]{a}$ from this.
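
To experiment with this exercise, here is a minimal Python sketch of Newton's iteration for $f(x) = x^m - a$, carried out in integer arithmetic so that it returns the integer part of $\sqrt[m]{a}$. This integer variant and the name `integer_root` are our own choices for illustration; the exercise itself concerns real approximations. For $f(x) = x^m - a$ the iteration reads $x_{n+1} = \bigl((m-1)x_n + a/x_n^{m-1}\bigr)/m$.

```python
def integer_root(a, m):
    """Largest integer x with x**m <= a, for integers a >= 0 and m >= 1."""
    if a < 2:
        return a
    # Start from a power of two that is at least the true m-th root.
    x = 1 << -(-a.bit_length() // m)
    while True:
        # Newton step for f(x) = x^m - a, with integer divisions.
        y = ((m - 1) * x + a // x ** (m - 1)) // m
        if y >= x:          # the sequence has stopped decreasing
            return x
        x = y
```

Starting above the root makes the sequence decrease until it reaches the integer part of the root, which gives a simple stopping test.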


7.2. Exercise. 1) Give a fast algorithm which checks if a given integer $N$
is a power $a^m$, where $m \ge 2$.


2) If we now want to test whether $N = p^m$ where $p$ is prime and $m \ge 2$,
we take $a \in [2, N - 1]$ and we test if $\gcd(a, N) = 1$. If that is the case,
we compute $d = \gcd(a^{N-1} - 1, N)$. Prove in this case that $p$ divides $d$ and
that, with a high probability, $d \ne N$ and also that $d = p$. Deduce from this
an algorithm to check whether $N = p^m$.


7.3. Exercise. (Multiplication algorithm, see [42]) Suppose that the
integers $m$ and $n$ are written in at most $2t$ binary digits, $n = n_12^t + n_0$ and
$m = m_12^t + m_0$. Observe that

$$mn = m_1n_1(2^{2t} - 2^t) + 2^t(m_1 + m_0)(n_1 + n_0) + m_0n_0(1 - 2^t)$$

and can therefore be calculated with three multiplications of numbers of
size $t$ and some additions and shifts (multiplication by 2 consists of one
shift of digits). Deduce from this an algorithm, where the cost $T(r)$ of the
multiplication of two numbers with $r$ digits satisfies

$$T(2r) \le 3T(r) + cr,$$

for some appropriate constant $c$. Deduce from this that $T(r) = O(r^{\alpha})$,
where $\alpha > \log 3/\log 2$. (Notice that, asymptotically, this algorithm is better
than the usual algorithm, whose complexity is $O(r^2)$.)
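
The identity of this exercise translates almost literally into a recursive procedure. The sketch below is one possible Python rendering, for non-negative integers; the name `multiply` and the `cutoff` value (below which the built-in product is used) are illustrative choices of ours.

```python
def multiply(m, n, cutoff=32):
    """Product of non-negative integers using three half-size multiplications."""
    size = max(m.bit_length(), n.bit_length())
    if size <= cutoff:
        return m * n                      # small operands: direct product
    t = (size + 1) // 2                   # both operands have at most 2t bits
    mask = (1 << t) - 1
    m1, m0 = m >> t, m & mask             # m = m1 * 2^t + m0
    n1, n0 = n >> t, n & mask             # n = n1 * 2^t + n0
    high = multiply(m1, n1, cutoff)
    low = multiply(m0, n0, cutoff)
    cross = multiply(m1 + m0, n1 + n0, cutoff)
    # mn = m1 n1 (2^(2t) - 2^t) + 2^t (m1 + m0)(n1 + n0) + m0 n0 (1 - 2^t)
    return (high << (2 * t)) - (high << t) + (cross << t) + low - (low << t)
```

Each call performs three recursive multiplications on numbers of about half the size, plus shifts and additions, which is the recurrence $T(2r) \le 3T(r) + cr$.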
7.4. Exercise. (Multiplication by fast Fourier transform) In this exercise,
we will give a theoretical presentation of the finite Fourier transform, which
will allow us to multiply very large numbers faster than the usual algorithm.
The hints are fairly brief, so you could also use a specialized reference, [42]
Sect. 4.3.3, to help you finish this exercise.

Let $N \ge 2$ be an integer and let $A$ be a ring. We identify the set $E$ of
functions from $\mathbf{Z}/N\mathbf{Z}$ to $A$ with the set of polynomials with coefficients in
$A$ of degree $< N$, in other words, with the elements of the ring
$A[X]/(X^N - 1)$. If $a = (a_i)_{0 \le i \le N-1}$ is a sequence indexed by $\mathbf{Z}/N\mathbf{Z}$, we
denote by $P_a$ the corresponding polynomial. We define a "convolution" by
$(a * b)_i = \sum_{j+h=i} a_jb_h$, and we can easily check that $P_{a*b} = P_aP_b$.


If $\zeta$ is a primitive $N$th root of unity in $A$, we define the "Fourier
transform" $\mathcal{F} : E \to E$ and its conjugate $\bar{\mathcal{F}} : E \to E$ by the formulas

$$(\mathcal{F}a)_j = \sum_{i \in \mathbf{Z}/N\mathbf{Z}} \zeta^{ij}a_i = P_a(\zeta^j) \quad\text{and}\quad (\bar{\mathcal{F}}a)_j = \sum_{i \in \mathbf{Z}/N\mathbf{Z}} \zeta^{-ij}a_i = P_a(\zeta^{-j}).$$

1) Prove that the following formulas hold: $\mathcal{F}(a * b) = \mathcal{F}(a) \cdot \mathcal{F}(b)$,
$\mathcal{F}(\bar{\mathcal{F}}a) = Na$ and $\bar{\mathcal{F}}(\mathcal{F}a) = Na$.

2) Whenever $N = 2N'$, we set $\zeta' := \zeta^2$ and $E' := A[X]/(X^{N'} - 1)$, and we
define $\mathcal{F}', \bar{\mathcal{F}}' : E' \to E'$ with the help of $\zeta'$. For $a \in E$, we define $a^0, a^1 \in
E'$ by setting $a^0_i = a_{2i}$ and $a^1_i = a_{2i+1}$. Check that, for $0 \le j \le N' - 1$, the
following formulas hold:

$$(\mathcal{F}a)_j = (\mathcal{F}'a^0)_j + \zeta^j(\mathcal{F}'a^1)_j \quad\text{and}\quad (\mathcal{F}a)_{N'+j} = (\mathcal{F}'a^0)_j - \zeta^j(\mathcal{F}'a^1)_j.$$

3) Now suppose that $N = 2^r$. Use the previous arguments to derive a
recursive procedure for calculating a Fourier transform. If we denote by $M(r)$
the number of multiplications and $A(r)$ the number of additions necessary
to carry out this procedure, show that $A(r) + M(r) = O(r2^r) = O(N \log N)$.

4) By using the first formula (convolution transformation and ordinary
product) and the preceding results, derive a multiplication algorithm for
polynomials with coefficients in $A$.

5) The choice of a numeral basis $b$ lets us write integers in the form $P_a(b) =
a_0 + a_1b + \dots + a_db^d$. Using the polynomial multiplication algorithm, derive
an algorithm for multiplying integers.


7.5. Exercise. A Fibonacci sequence of integers is defined by $u_0 = a$,
$u_1 = b$ and $u_n = u_{n-1} + u_{n-2}$ for $n \ge 2$, where $1 \le a \le b$ are integers (the
classical Fibonacci sequence corresponds to $a = b = 1$).

1) Prove that $\log|u_n| \sim n \log\bigl(\frac{1+\sqrt{5}}{2}\bigr)$.

2) Prove that $\gcd(u_{n+1}, u_n) = \gcd(b, a)$ and that the Euclidean algorithm
gives this result in $n$ steps. Deduce from this that the complexity estimation
given at the beginning of this chapter is generally optimal.


7.6. Exercise. Prove that the following algorithm allows us to calculate the
gcd of two integers, and estimate its complexity. If $n$ and $m$ are even,
factor out 2; if $n$ is even and $m$ is odd (or conversely), replace $n$ by $n/2$;
if $m$ and $n$ are odd, replace $n$ by $(n - m)/2$.
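
For readers who wish to experiment before proving anything, the three rules can be written out as follows. This Python sketch is one possible reading of the exercise: the name `binary_gcd`, the swap that keeps $n \ge m$ in the odd case, and the convention $\gcd(n, 0) = n$ are our own additions.

```python
def binary_gcd(n, m):
    """gcd of two integers using only halvings, subtractions and parity tests."""
    n, m = abs(n), abs(m)
    power_of_two = 1
    while n != 0 and m != 0:
        if n % 2 == 0 and m % 2 == 0:
            n, m = n // 2, m // 2          # both even: factor out 2
            power_of_two *= 2
        elif n % 2 == 0:
            n //= 2                        # n even, m odd: replace n by n/2
        elif m % 2 == 0:
            m //= 2                        # conversely
        else:
            if n < m:
                n, m = m, n                # keep n >= m
            n = (n - m) // 2               # both odd: replace n by (n - m)/2
    return power_of_two * (n + m)          # one of n, m is zero here
```

Every pass through the loop removes at least one binary digit from $n$ or $m$, which is the starting point for the complexity estimate asked for.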


7.7. Exercise. Let $M \in \mathbf{Z}$ and let $N$ be an odd positive integer. Prove that
the Euclidean algorithm, together with the quadratic reciprocity law, gives
a fast algorithm (and estimate its complexity) for calculating the Jacobi
symbol $\left(\frac{M}{N}\right)$.


7.8. Exercise. Let $M := 85$; we define the sets $G_0 := (\mathbf{Z}/M\mathbf{Z})^*$,
$G_1 := \{a \in G_0 \mid a^{84} = 1\}$, $G_2 := \{a \in G_0 \mid a^{42} = \pm 1\}$,
$G_3 := \{a \in G_0 \mid a^{42} = \left(\frac{a}{85}\right)\}$ and finally
$S := \{a \in G_0 \mid a^{21} = 1 \text{ or } a^{42} = -1 \text{ or } a^{21} = -1\}$.


3.a) Prove that if $a \in S$, then $-a \in S$, and use this to deduce that the
cardinality of $S$ is even.

3.b) Calculate the cardinality of $G_0$, $G_1$, $G_2$ and $S$.

3.c) Use this to find the cardinality of $G_3$.

3.d) Is the set $S$ a subgroup of $G_0$?


7.9. Exercise. For $n \ge 2$, we denote by $\Phi_n$ the $n$th cyclotomic polynomial.

1) Recall how to decompose $\Phi_n$ in $\mathbf{F}_p[X]$.

2) Let $a \in \mathbf{Z}$ and let $p$ be a prime number which does not divide $n$ but which
divides $\Phi_n(a)$. Prove that $p \equiv 1 \bmod n$ (you could start by observing that
the class of $a$ modulo $p$ is a root of $\Phi_n$).

3) Prove that $\Phi_n(0) = 1$ and deduce from this that for all $m \ge 2$, $\Phi_n(m)$ is
relatively prime to $m$. Also prove that there are only finitely many $a \in \mathbf{Z}$
such that $\Phi_n(a) = \pm 1$.

4) Deduce from this (without using Dirichlet's theorem on arithmetic
progressions) that there exist infinitely many prime numbers $p$ such that
$p \equiv 1 \bmod n$ (resp. infinitely many prime numbers $p$ such that
$p \not\equiv 1 \bmod n$).


7.10. Exercise. Let $G$ be a finite abelian group.

1) Prove that there exists an integer $N$ such that $G$ is isomorphic to a
subgroup (resp. a quotient) of $(\mathbf{Z}/N\mathbf{Z})^*$.

Hint. We can reduce to the case where $G = \mathbf{Z}/n_1\mathbf{Z} \times \dots \times \mathbf{Z}/n_s\mathbf{Z}$. By using
the result proven in the previous exercise, we can choose prime numbers
$p_i \equiv 1 \bmod n_i$, and show that $N := p_1 \cdots p_s$ works.

2) (This question requires some knowledge of Galois theory, see for example
Appendix C, in particular Examples C-1.1.) Prove that there exists a finite
Galois extension, $K/\mathbf{Q}$, such that $\operatorname{Gal}(K/\mathbf{Q}) \cong G$.


7.11. Exercise. Let $P = X^4 + 1$. We will study its factorization over
various fields.

1) Prove that $P$ is irreducible in $\mathbf{Q}[X]$ and calculate its factorization over
the fields $\mathbf{Q}(i)$, $\mathbf{Q}(\sqrt{2})$ and $\mathbf{Q}(i\sqrt{2})$.

2) Show that for every prime number $p$, $P$ is not irreducible over $\mathbf{F}_p$.

Hint. Construct a factorization by using the fact that $-1$, $2$ or $-2$ is a
square. Variation: observe that $P = \Phi_8$ and invoke Theorem 2-6.2.8.


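7.12. Exercise. 1) Prove that the following relations hold (you could
compare the degrees and the roots of both sides):

$$\Phi_n(X^p) = \begin{cases} \Phi_{np}(X) & \text{if } p \text{ divides } n, \\ \Phi_{np}(X)\,\Phi_n(X) & \text{if } p \text{ does not divide } n. \end{cases}$$

2) Prove that $\Phi_{p^r} = X^{p^{r-1}(p-1)} + X^{p^{r-1}(p-2)} + \dots + X^{p^{r-1}} + 1$ (for $r \ge 1$).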


7.13. Exercise. For $n \ge 3$, we denote by $\Phi_n^+(X)$ the monic polynomial
with the property that $(\Phi_n^+(X))^2 = \prod_{\zeta \in \mu_n^*}(X - \zeta - \zeta^{-1})$.
1) Compute ®}, &F and OF. 


2) Prove that $\deg \Phi_n^+ = \varphi(n)/2$ and $\Phi_n(X) = X^{\varphi(n)/2}\,\Phi_n^+(X + X^{-1})$.
Deduce from this that $\Phi_p^+(2) = \Phi_p(1) = p$.


3) Prove that $\Phi_n^+$ is in $\mathbf{Z}[X]$ and is irreducible (in particular, it is the
minimal polynomial of $2\cos(2\pi/n)$).


7.14. Exercise. Let $P = \prod_{i=1}^{m}(X - \alpha_i)$ and $Q = \prod_{j=1}^{n}(X - \beta_j)$ be two
polynomials in $K[X]$. We define their resultant by the formula

$$\operatorname{res}(P, Q) := \prod_{i=1}^{m}\prod_{j=1}^{n}(\alpha_i - \beta_j).$$

We refer you to a classical algebra text (cf. for example [43]) to see how
$\operatorname{res}(P, Q)$ can be expressed as a determinant in the coefficients of $P$ and $Q$,
which shows in particular that $\operatorname{res}(P, Q) \in K$ and, more generally, that if
$P, Q \in A[X]$, then $\operatorname{res}(P, Q) \in A$.


1) Prove that $\operatorname{res}(Q, P) = (-1)^{mn}\operatorname{res}(P, Q)$.

2) We will assume from now on that $P, Q \in \mathbf{Z}[X]$, and we choose $q$ to be an
odd prime. We denote by $\bar{P}$ (resp. $\bar{Q}$) the reduction modulo $q$ of $P$ (resp.
of $Q$). Prove that the class of $\operatorname{res}(P, Q)$ modulo $q$ is equal to $\operatorname{res}(\bar{P}, \bar{Q})$.

3) Prove that $\bar{\Phi}_q^+ = (X - 2)^{(q-1)/2}$ in $\mathbf{F}_q[X]$ ($\Phi_q^+$ is defined in Exercise
2-7.13).


4) Use the previous questions and question 2) of Exercise 2-7.13 to show 
that if p and q are distinct odd primes, then 


res(@t, &*) ape = (4) od q. 


5) Prove that res(®},®7) = |] ~-1)/26 (nm) and deduce from this 
that res(®}, ®F) € {+1,—-1}. 


6) Prove that the following formula holds, 


res(0},05) = (4), 


NEMG 1) 


and use this to give a proof of the quadratic reciprocity law. 


7.15. Exercise. Let $N$ be an odd integer.

1) If its factorization can be written as $N = p_1^{m_1} \cdots p_k^{m_k}$, where
$p_i - 1 = 2^{s_i}L_i$ and the $L_i$ are odd, prove that

$$\frac{\operatorname{card}\{a \in (\mathbf{Z}/N\mathbf{Z})^* \mid \operatorname{ord}(a \bmod N) \text{ is odd}\}}{\operatorname{card}\{a \in (\mathbf{Z}/N\mathbf{Z})^*\}} = 2^{-s_1 - \dots - s_k}.$$

2) Deduce from this that if we had a fast algorithm, $\mathcal{P}$, which calculates the
period (the order of $a \bmod N$), then we would have a fast probabilistic
factorization algorithm.

Hint. Randomly choose $a$, test to see whether $\gcd(a, N) = 1$, then whether
the period $\mathcal{P}(a)$ is even; in this case compute $\gcd(a^{\mathcal{P}(a)/2} + 1, N)$.


7.16. Exercise. Prove that $2^m + 1$ can only be prime if $m = 2^n$. Set
$F_n := 2^{2^n} + 1$ (known as a Fermat number). Prove that $F_n$ is prime if and
only if $F_n$ divides $3^{\frac{F_n - 1}{2}} + 1$. Check that $F_0$, $F_1$, $F_2$, $F_3$ and $F_4$ are prime,
but not $F_5$ (which is divisible by 641).


7.17. Exercise. (Lucas test and Mersenne numbers) Start by proving
that $M_n := 2^n - 1$ can only be prime if $n$ is itself prime. Check that
$M_2$, $M_3$, $M_5$, $M_7$ are prime, but that $M_{11}$ is not prime. The numbers $M_p =
2^p - 1$ are called Mersenne numbers. In this exercise, we ask you to prove
the Lucas primality test for these numbers.

a) We define a sequence with values in a ring $A$ by $V_0 = 2$, $V_1 = a$ and
$V_{n+1} - aV_n + V_{n-1} = 0$. Verify the following formulas: $V_{2n-1} = V_nV_{n-1} - a$,
$V_{2n} = V_n^2 - 2$, and also $V_nV_m = V_{n+m} + V_{n-m}$.

b) Let $M$ be odd, $a$ an integer such that $\gcd(a^2 - 4, M) = 1$ and $V_n$ the
sequence defined above. If $V_{M+1} \equiv 2 \bmod M$ and if for every prime number
$q$ which divides $M + 1$ we have $\gcd(V_{\frac{M+1}{q}} - 2, M) = 1$, prove that $M$ is
prime.

c) We define the following sequence by $L_1 := 4$ and $L_{i+1} := L_i^2 - 2$. Prove
that the Mersenne number $M_p$ is prime if and only if $L_{p-1} \equiv 0 \bmod M_p$.
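
Both criteria (the one for Fermat numbers in Exercise 7.16 and the one in part c) above) are one-line computations with modular arithmetic. The Python sketch below is only meant for checking small cases; the function names are ours, and we restrict to $n \ge 1$ for Fermat numbers and to odd primes $p$ for Mersenne numbers, which is an assumption on our part about the intended range of the statements.

```python
def fermat_criterion(n):
    """For n >= 1: does F_n = 2^(2^n) + 1 divide 3^((F_n - 1)/2) + 1 ?"""
    F = 2 ** (2 ** n) + 1
    return pow(3, (F - 1) // 2, F) == F - 1


def lucas_criterion(p):
    """For an odd prime p: is L_(p-1) = 0 mod M_p, where L_1 = 4, L_(i+1) = L_i^2 - 2 ?"""
    M = 2 ** p - 1
    L = 4                          # L_1
    for _ in range(p - 2):         # p - 2 squarings lead from L_1 to L_(p-1)
        L = (L * L - 2) % M
    return L == 0
```

Reducing modulo $M_p$ at every step keeps all intermediate numbers below $M_p^2$, so the test costs about $p$ multiplications of $p$-bit numbers.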


7.18. Exercise. (Perfect numbers) This nice problem has been handed
down to us from Euclid: we say that an integer is perfect if it is equal to
the sum of its proper divisors, symbolically:

$$n = \sum_{d \mid n,\ d \ne n} d \quad\text{or}\quad 2n = \sigma(n) := \sum_{d \mid n} d.$$

a) Show that if $M_p = 2^p - 1$ is a prime Mersenne number (cf. previous
exercise), then $P_p := 2^{p-1}M_p$ is a perfect number (this fact as well as the
examples $P_2 = 6$, $P_3 = 28$, $P_5 = 496$ were known to Euclid).

b) Prove the following result due to Euler: an even perfect number $n$ is of
the form $P_p$.

Hint. Write $n = 2^mM$ with $M$ odd and $m \ge 1$; prove that $2n = \sigma(2^m)\sigma(M)$
and deduce from this that $M$ must be prime, then finish the exercise.

Remark. Nobody knows whether there exists an odd perfect number; it
is generally conjectured that there do not exist any and that the perfect
numbers are in bijection with the prime Mersenne numbers.


7.19. Exercise. (Pocklington-Lehmer test or certificate) Let $N \ge 2$.
Suppose that $N - 1$ is (partially) factored as $N - 1 = p_1^{e_1} \cdots p_s^{e_s}M$, with
$M < \sqrt{N}$, and moreover that for each $p_i$, we have an $a_i$ such that

Use this to show that if $q$ divides $N$, then $q \equiv 1 \bmod p_i^{e_i}$, and also that $N$
is prime.


7.20. Exercise. Let $\beta$ be a primitive 17th root of unity in an extension
of $\mathbf{F}_2$. We let $I := \mathbf{F}_{17}^{*2}$ and set

$$f(X) = \prod_{i \in I}(X - \beta^i).$$

Prove that the polynomial $f(X)$ defines a cyclic code $\mathcal{C}$ of length 17, and
calculate its dimension and bounds on its distance $d(\mathcal{C})$, for example
$3 \le d(\mathcal{C}) \le 6$. Then give the exact value of $d(\mathcal{C})$.


7.21. Exercise. 1.a) Describe the degrees of the decomposition into irreducible factors of $X^{85} - 1$ in $\mathbf{Q}[X]$.

1.b) Give the number of irreducible factors, as well as their degrees, of the decomposition of $X^{85} - 1$ in $\mathbf{F}_2[X]$.

1.c) Explain how to construct a binary cyclic code of length 85 and dimension 64. Is it possible to construct such a code with dimension 63?
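For experimenting with parts 1.b) and 1.c), one can use the standard fact that the irreducible factors of $X^n - 1$ over $\mathbf{F}_2$ (for odd $n$) correspond to the orbits of multiplication by 2 on $\mathbf{Z}/n\mathbf{Z}$, the degree of a factor being the size of its orbit. The following sketch relies on this fact; the function name `cyclotomic_cosets` is chosen here for illustration.

```python
def cyclotomic_cosets(q: int, n: int):
    """Orbits of multiplication by q on Z/nZ (q and n are assumed coprime)."""
    seen, cosets = set(), []
    for s in range(n):
        if s in seen:
            continue
        coset, x = [], s
        while x not in coset:
            coset.append(x)
            x = (x * q) % n
        seen.update(coset)
        cosets.append(coset)
    return cosets


# Degrees of the irreducible factors of X^85 - 1 over F_2.
sizes = sorted(len(c) for c in cyclotomic_cosets(2, 85))
print(sizes)

# Possible degrees of a generator polynomial: sums of sub-collections of these degrees.
reachable = {0}
for s in sizes:
    reachable |= {r + s for r in reachable}
# A cyclic code of length 85 and dimension k has a generator of degree 85 - k.
print(85 - 64 in reachable, 85 - 63 in reachable)
```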


7.22. Exercise. (Where we show that $d(\mathcal{G}_{11}) = 5$ and $d(\mathcal{G}_{23}) = 7$ and use the notion of a self-dual code.)

A) Let $\mathcal{C}$ be a cyclic code of length $n$ generated by the polynomial $g = g(X)$ of degree $d$. Let $\mathcal{C}'$ be its even subcode and $\mathcal{C}^{\perp}$ its dual code.

1) Prove that $\mathcal{C}' = \mathcal{C}$ if and only if $g(1) = 0$. If $g(1) \neq 0$, check that $\mathcal{C}'$ is cyclic and generated by the polynomial $(X - 1)g(X)$.

2) Prove that $\mathcal{C}^{\perp}$ is cyclic and generated by the polynomial $h^*(X) = X^{n-d}h(1/X)$ where $g(X)h(X) = X^n - 1$.

Hint. You can show that if $\deg(f) \le n - d - 1$ and $\deg(e) \le d - 1$, then $\langle fg, eh^* \rangle$ is equal to the coefficient of $X^{n-1}$ in the product $f(X)g(X)e^*(X)h(X) = f(X)e^*(X)(X^n - 1)$, and is therefore zero.

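B) Suppose that $\mathcal{C} \subset \mathcal{C}^{\perp}$ (i.e., for all $x, y \in \mathcal{C}$, we have $\langle x, y \rangle = 0$).

1) If $q = 2$, prove that for all $x, y \in \mathcal{C}$, we have $w(x + y) \equiv w(x) + w(y) \bmod 4$.

2) If $q = 3$, prove that for all $x, y \in \mathcal{C}$, we have $w(x + y) \equiv w(x) + w(y) \bmod 3$.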
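C) We introduce the subcode $\mathcal{F}$ of $\mathcal{G}_{11}$ composed of vectors whose sum of the coordinates equals zero (the "even" subcode).

1) Prove that if $g(X)$ is the generating polynomial of $\mathcal{G}_{11}$, the code $\mathcal{F}$ is cyclic and its generator is $(X - 1)g(X)$.

2) Prove that $\mathcal{F} \subset \mathcal{F}^{\perp}$ (i.e., for every $x, y \in \mathcal{F}$ we have $\langle x, y \rangle = 0$). Deduce from this that for every $x \in \mathcal{F}$, we have $w(x) \equiv 0 \bmod 3$.

3) We denote by $\bar{\mathcal{F}}$ and $\bar{\mathcal{G}}_{11}$ the extended codes. Set $e_{11} = (1, \dots, 1) \in \mathbf{F}_3^{11}$ and $e_{12} = (1, \dots, 1) \in \mathbf{F}_3^{12}$. Prove that $e_{11} \in \mathcal{G}_{11}$, $e_{12} \in \bar{\mathcal{G}}_{11}$ and hence $\bar{\mathcal{G}}_{11} = \bar{\mathcal{F}} \oplus \mathbf{F}_3e_{12}$.

4) Prove that $\bar{\mathcal{G}}_{11}$ is self-dual. Deduce from this that for every $x, y \in \bar{\mathcal{G}}_{11}$, we have $w(x + y) \equiv w(x) + w(y) \bmod 3$, and hence that $d(\bar{\mathcal{G}}_{11}) \equiv 0 \bmod 3$.

5) Knowing that $4 \le d(\mathcal{G}_{11}) \le 5$ and $d(\mathcal{C}) \le d(\bar{\mathcal{C}}) \le d(\mathcal{C}) + 1$, conclude that $d(\mathcal{G}_{11}) = 5$ and $d(\bar{\mathcal{G}}_{11}) = 6$.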


D) Let $p$ be an odd prime such that $\left(\frac{2}{p}\right) = 1$, $S := \mathbf{F}_p^{*2}$ and $\mathcal{C}$ a binary code of length $p$ which corresponds to the set $S$ (which, by hypothesis, is stable under multiplication by 2). We denote by $\bar{\mathcal{C}}$ the extended code of length $p + 1$.

1) If $g = g(X)$ is a generator of $\mathcal{C}$ and if $g^*(X) = X^{(p-1)/2}g(1/X)$ is its reciprocal polynomial, show that $g(X) = g^*(X)$ if $p \equiv 1 \bmod 8$, and that $\Phi_p(X) = g(X)g^*(X)$ if $p \equiv -1 \bmod 8$.

2) We suppose from now on that $p \equiv -1 \bmod 8$. Prove that $\bar{\mathcal{C}}$ is self-dual (i.e., $\bar{\mathcal{C}} = \bar{\mathcal{C}}^{\perp}$, or for all $\bar{x}, \bar{y} \in \bar{\mathcal{C}}$, we have $\langle \bar{x}, \bar{y} \rangle = 0$).

3) Let $x = \sum_{i \in I} X^i$ and $y = \sum_{j \in J} X^j$. Show that $\langle x, y \rangle \equiv |I \cap J| \bmod 2$ and that $w(x + y) = |I| + |J| - 2|I \cap J|$. Conclude from this that if $\langle x, y \rangle = 0$, then $w(x + y) \equiv w(x) + w(y) \bmod 4$.

4) Use the previous question to show that if $\mathcal{D}$ is a self-dual code generated by elements whose weight is a multiple of 4, then every element of $\mathcal{D}$ has weight which is a multiple of 4, and in particular, $d(\mathcal{D}) \equiv 0 \bmod 4$.

5) Apply the preceding questions to the case $p = 23$. Observe that if $g$ is the generator of $\mathcal{C} = \mathcal{G}_{23}$, we have $w(g) = 7$, so $w(\bar{g}) = 8$. Conclude from this that $d(\bar{\mathcal{C}}) \equiv 0 \bmod 4$. Knowing that $5 \le d(\mathcal{G}_{23}) \le 7$ and $d(\mathcal{C}) \le d(\bar{\mathcal{C}}) \le d(\mathcal{C}) + 1$, deduce that $d(\mathcal{G}_{23}) = 7$ and $d(\bar{\mathcal{G}}_{23}) = 8$.


Chapter 3 


Algebra and Diophantine 
Equations 


“..it is a thing of beauty and of joy for ever...” 


JAMES JOYCE 


In this chapter, we address some classical problems in number theory, such 
as finding integer solutions to polynomial equations. The examples that we 
will look at cover three large topics. 


1) The decomposition of an integer $n$ into the sum of two, three or four squares, in other words, the search for solutions of the equation $n = x_1^2 + \cdots + x_r^2$.

2) "Fermat's last theorem" (proven by Andrew Wiles in 1995): the only solutions to the equation $x^n + y^n = z^n$ for $n \ge 3$ are the trivial ones (i.e., $xyz = 0$).

3) Solutions to Pell's equation $x^2 - dy^2 = 1$ (or more generally $x^2 - dy^2 = n$).

The study of congruences (the theme of Chap. 1) gives us necessary conditions for the existence of solutions to such an equation. The methods introduced in this chapter are the use of rings more general than $\mathbf{Z}$ and also results about rational approximations.


To be more precise, we will study rings such as $\mathbf{Z}[i]$, $\mathbf{Z}[\exp(2\pi i/n)]$, $\mathbf{Z}[\sqrt{d}]$ and even the noncommutative ring of Hurwitz quaternions, a subring of the division ring of quaternions defined by Hamilton. On the other hand, we will have a look at how fast a sequence of rational numbers can converge to a real number.

We will finish with an outline of the main properties of these rings by introducing some supplementary notions from algebra (algebraic integers and Dedekind rings) and from the geometry of numbers (lattices and Minkowski's theorem).


1. Sums of Squares 


We want to find out under which conditions an integer n € N can be 
written as the sum of squares. Let us first have a look at some constraints 
that we can write in terms of congruences. 


We know that $x^2 \equiv 0$ or $1$ modulo 4, thus a number $n = 4n' + 3$ cannot be the sum of two squares. To be more precise, notice that if $p \equiv 3 \bmod 4$ and $p$ divides $n = x^2 + y^2$, then $p$ must divide $y$. This is true because otherwise, we could write $(xy^{-1})^2 \equiv -1 \bmod p$ and then deduce that $-1$ is a square modulo $p$, which cannot be true. Since $p$ divides $y$, it also divides $x$, and we can conclude that $x = px'$, $y = py'$ and $n = p^2n'$ with $n' = x'^2 + y'^2$. By repeatedly applying this argument, we see that if $p \equiv 3 \bmod 4$ and if $n = p^{2k+1}m$, where $m$ and $p$ are relatively prime, then $n$ is not the sum of two squares.

Notice that if $x$ is even, then $x^2 \equiv 0$ or $4$ modulo 8, whereas if $x$ is odd, then $x^2 \equiv 1$ modulo 8. We then see that $x^2 + y^2 + z^2$ is never congruent to 7 modulo 8. We can slightly refine this argument: if $n = 4n'$ and if $n = x^2 + y^2 + z^2$, then we see that $x$, $y$ and $z$ must be even, hence $x = 2x'$, $y = 2y'$ and $z = 2z'$, with $n' = x'^2 + y'^2 + z'^2$. By repeatedly applying this reasoning, we see that if $n$ is of the form $n = 4^a(8m + 7)$, then $n$ is not the sum of three squares.


It is a remarkable fact that the obstructions given by these congruences 
are, in the case of the sums of squares, the only ones. 


1.1. Theorem. (Two-square theorem) An integer n € N is the sum of 
two squares if and only if every prime number p congruent to 3 modulo 4 
appears with an even exponent in the decomposition of n into prime factors. 
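The two-square theorem is easy to test by brute force on small integers. The sketch below is an illustration only (the function names and the bound 2000 are chosen here): it compares a direct search for a representation $n = x^2 + y^2$ with the criterion on the prime factorization.

```python
from math import isqrt


def is_sum_of_two_squares(n: int) -> bool:
    """Direct search for x, y >= 0 with x^2 + y^2 = n."""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
    return False


def satisfies_criterion(n: int) -> bool:
    """True if every prime p = 3 mod 4 divides n to an even power (n >= 1)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return False
        p += 1
    # Whatever remains is 1 or a prime appearing with exponent 1.
    return not (n > 1 and n % 4 == 3)


mismatches = [n for n in range(1, 2000)
              if is_sum_of_two_squares(n) != satisfies_criterion(n)]
print(mismatches)
```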


1.2. Theorem. (Three-square theorem) An integer $n \in \mathbf{N}$ is the sum of three squares of integers if and only if it is not of the form $n = 4^a(8m + 7)$.
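As with the two-square theorem, the statement can be checked by brute force on a small range. This is a sketch only; the names `is_sum_of_three_squares` and `is_excluded` and the bound 500 are chosen here for illustration.

```python
from math import isqrt


def is_sum_of_three_squares(n: int) -> bool:
    """Direct search for x, y, z >= 0 with x^2 + y^2 + z^2 = n."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            z2 = n - x * x - y * y
            z = isqrt(z2)
            if z * z == z2:
                return True
    return False


def is_excluded(n: int) -> bool:
    """True if n >= 1 has the form 4^a (8m + 7)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


mismatches = [n for n in range(1, 500)
              if is_sum_of_three_squares(n) == is_excluded(n)]
print(mismatches)
```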


1.3. Theorem. (Four-square theorem) Let $n \in \mathbf{N}$, then there exist integers $x, y, z, t$ such that $n = x^2 + y^2 + z^2 + t^2$.


We are going to postpone the proof of the second theorem until later (see, for example, Serre's book [8] or Exercise 3-6.8 together with the Hasse-Minkowski theorem 6-3.18 and its Corollary 6-3.19, or also Exercises 3-6.9 and 3-6.10). To prove the first theorem, we introduce the ring $\mathbf{Z}[i]$, and to prove the third, we introduce the ring of Hurwitz quaternions.

We can immediately see from the statements of these theorems that the set of sums of two squares (resp. of four squares) is stable under multiplication, but not the set of sums of three squares. For example, $18 = 2 \cdot 3^2 = 4^2 + 1^2 + 1^2$ and $14 = 2 \cdot 7 = 3^2 + 2^2 + 1^2$, but $18 \cdot 14 = 4 \cdot 9 \cdot 7$ is not the sum of three squares. The multiplicativity of the set of sums of two (resp. four) squares can be explained by the following formulas:


$$(x^2 + y^2)(a^2 + b^2) = (ax - by)^2 + (ay + bx)^2$$

and

$$(x^2 + y^2 + z^2 + t^2)(a^2 + b^2 + c^2 + d^2) = (ax - by - cz - dt)^2 + (ay + bx - ct + dz)^2 + (az + bt + cx - dy)^2 + (at - bz + cy + dx)^2.$$
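Both identities are polynomial identities, so they can be tested on random integers. The following sketch (the function name `four_square_product` is chosen here for illustration) encodes the four expressions of the second formula and checks it on random inputs.

```python
import random


def four_square_product(x, y, z, t, a, b, c, d):
    """The four expressions appearing on the right-hand side of the identity."""
    return (a * x - b * y - c * z - d * t,
            a * y + b * x - c * t + d * z,
            a * z + b * t + c * x - d * y,
            a * t - b * z + c * y + d * x)


random.seed(0)
ok = True
for _ in range(1000):
    x, y, z, t, a, b, c, d = (random.randint(-50, 50) for _ in range(8))
    lhs = (x * x + y * y + z * z + t * t) * (a * a + b * b + c * c + d * d)
    rhs = sum(v * v for v in four_square_product(x, y, z, t, a, b, c, d))
    ok = ok and lhs == rhs
print(ok)
```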


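The origin of these formulas will be clear once we give an interpretation of them in $\mathbf{Z}[i]$ or in the quaternions.

If we set $\mathcal{C}_2 := \{n \in \mathbf{N} \mid \exists x, y \in \mathbf{N},\ n = x^2 + y^2\}$ and

$$\mathcal{C}_4 := \{n \in \mathbf{N} \mid \exists x, y, z, t \in \mathbf{N},\ n = x^2 + y^2 + z^2 + t^2\},$$

we see that it is enough to show that every prime number which is congruent to 1 modulo 4 is in $\mathcal{C}_2$ and that every prime number is in $\mathcal{C}_4$.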


We are going to construct the classical example of a noncommutative divi- 
sion ring, the ring of quaternions discovered by Hamilton, and elaborate on 
its arithmetical properties to establish a proof of the four-square theorem. 


The most concrete of the constructions of the ring of quaternions undoubtedly consists of endowing the 4-dimensional real vector space with basis $1, I, J, K$ with a bilinear multiplication for which $1$ is the multiplicative identity and which satisfies

$$I^2 = J^2 = K^2 = -1, \quad IJ = -JI = K, \quad JK = -KJ = I \quad\text{and}\quad KI = -IK = J. \tag{3.1}$$


We should verify associativity “by hand”: for example, (IJ)K = K? =-1 
and I(JK) = I? = —1. To spare the 24 necessary verifications, we could 
also define H as the subalgebra of 2 x 2 complex matrices or 4 x 4 real 
matrices (associativity is immediate in this case, but one needs to check 
that these matrices satisfy formulas (3.1)). We could also define 


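$$\mathbf{H} = \left\{ \begin{pmatrix} \alpha & -\bar{\beta} \\ \beta & \bar{\alpha} \end{pmatrix} \;\middle|\; \alpha, \beta \in \mathbf{C} \right\}$$

or also

$$\mathbf{H} = \left\{ \begin{pmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{pmatrix} \;\middle|\; a, b, c, d \in \mathbf{R} \right\}$$

with

$$1 = \begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix}, \quad I = \begin{pmatrix} 0&-1&0&0 \\ 1&0&0&0 \\ 0&0&0&-1 \\ 0&0&1&0 \end{pmatrix},$$

$$J = \begin{pmatrix} 0&0&-1&0 \\ 0&0&0&1 \\ 1&0&0&0 \\ 0&-1&0&0 \end{pmatrix}, \quad K = \begin{pmatrix} 0&0&0&-1 \\ 0&0&-1&0 \\ 0&1&0&0 \\ 1&0&0&0 \end{pmatrix}.$$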


1.4. Remark. The construction of $\mathbf{H}$ endows it with the structure of an $\mathbf{R}$-algebra generated by two elements $i$ and $j$, with the relations $i^2 = j^2 = -1$ and $ij = -ji$. To see this, we let $k := ij$ and deduce the rest of the multiplication table, since $k^2 = ijij = -iijj = -1$ and $ik = iij = -j = (ii)j = -iji = -ki$, etc. The fact that $\mathbf{H}$ is noncommutative is already given in the multiplication table.

The conjugate of a quaternion $z = a1 + bI + cJ + dK$ is defined by $\bar{z} = a1 - bI - cJ - dK$, its reduced trace by $\mathrm{Tr}(z) = z + \bar{z}$ and its reduced norm by $N(z) = z\bar{z}$ (from now on simply referred to as the trace and the norm).

1.5. Lemma. If $z, w \in \mathbf{H}$, then $\overline{z + w} = \bar{z} + \bar{w}$ and $\overline{zw} = \bar{w} \cdot \bar{z}$, and if $z = a1 + bI + cJ + dK$, then $N(z) = z\bar{z} = \bar{z}z = (a^2 + b^2 + c^2 + d^2)1$ and $\mathrm{Tr}(z) = 2a1$. Furthermore, $\mathrm{Tr}(z + z') = \mathrm{Tr}(z) + \mathrm{Tr}(z')$, $N(zz') = N(z)N(z')$, and $z$ is a root of the polynomial $X^2 - \mathrm{Tr}(z)X + N(z) \in \mathbf{R}[X]$.
Proof. These formulas can be checked by direct calculation (left to the 


reader). Take note that the conjugation is an anti-isomorphism of rings, 
i.e., it reverses the order of multiplication. 
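The verifications "by hand" mentioned above (the relations (3.1), associativity on the basis, and the formulas of the lemma) can be delegated to a few lines of code. This is a sketch only: quaternions are represented as 4-tuples $(a, b, c, d)$ standing for $a1 + bI + cJ + dK$, and the names `qmul`, `conj` and `norm` are chosen here for illustration.

```python
from itertools import product


def qmul(p, q):
    """Product of two quaternions given as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    return sum(v * v for v in q)


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# Relations (3.1).
relations_ok = (qmul(I, I) == qmul(J, J) == qmul(K, K) == MINUS_ONE
                and qmul(I, J) == K and qmul(J, K) == I and qmul(K, I) == J
                and qmul(J, I) == (0, 0, 0, -1))

# Associativity on all triples of basis elements (it extends by bilinearity).
basis = [ONE, I, J, K]
assoc_ok = all(qmul(qmul(u, v), w) == qmul(u, qmul(v, w))
               for u, v, w in product(basis, repeat=3))

# Multiplicativity of the norm and anti-multiplicativity of the conjugation.
z, w = (1, 2, -3, 4), (5, -6, 7, 8)
lemma_ok = (norm(qmul(z, w)) == norm(z) * norm(w)
            and conj(qmul(z, w)) == qmul(conj(w), conj(z)))
print(relations_ok, assoc_ok, lemma_ok)
```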


It follows that $\mathbf{H}$ is a division ring, since if $z = a1 + bI + cJ + dK$ is a non-zero quaternion, then $N(z) = a^2 + b^2 + c^2 + d^2 \in \mathbf{R}^*$ and $z\bar{z}/N(z) = 1$, hence $z^{-1} = \bar{z}/N(z)$.

We will now introduce the ring $\mathbf{Z}[i]$ (of Gaussian integers) and the two rings

$$A_0 = \mathbf{Z}1 + \mathbf{Z}I + \mathbf{Z}J + \mathbf{Z}K \quad\text{and}\quad A = A_0 + \mathbf{Z}\left(\frac{1 + I + J + K}{2}\right).$$

The set $A$ is a subring of $\mathbf{H}$, because if we let $\delta := (1 + I + J + K)/2$, we have $\delta^2 = \delta - 1$ and $I\delta = \delta - 1 - J$, etc. The elements of the ring $A$ are called Hurwitz quaternions.

It is clear that $\mathcal{C}_2 = \{N(z) \mid z \in \mathbf{Z}[i]\}$ and $\mathcal{C}_4 = \{N(z) \mid z \in A_0\}$. In fact, we also have $\mathcal{C}_4 = \{N(z) \mid z \in A\}$, since if we assume the following elementary lemma, then we know that $N(x1 + yI + zJ + tK) \in \mathcal{C}_4$ if $x, y, z, t \in \mathbf{Z} + 1/2$.


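1.6. Lemma. Let $\alpha = \dfrac{x1 + yI + zJ + tK}{2} \in A$, where $x, y, z$ and $t$ are odd integers. Then there exists $\epsilon = \dfrac{\pm 1 \pm I \pm J \pm K}{2}$ such that $\epsilon\alpha$ is in $A_0$ and $N(\alpha) = N(\epsilon\alpha)$.

Proof. We write $x = 4x' + \epsilon_1$, $y = 4y' + \epsilon_2$, $z = 4z' + \epsilon_3$, $t = 4t' + \epsilon_4$, with $\epsilon_i = \pm 1$. If we set $\epsilon := \dfrac{\epsilon_1 1 - \epsilon_2 I - \epsilon_3 J - \epsilon_4 K}{2}$, then $N(\epsilon) = 1$, hence $N(\alpha\epsilon) = N(\alpha)$, and therefore

$$\alpha\epsilon = \left(\frac{4x'1 + 4y'I + 4z'J + 4t'K}{2} + \bar{\epsilon}\right)\epsilon = (x'1 + y'I + z'J + t'K)(2\epsilon) + 1 \in A_0.$$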


The following lemma will also be useful. 


1.7. Lemma. In the rings Z[i], Ao and A, an element is invertible if and 
only if its norm is 1. 


Proof. If $\alpha$ is invertible, then $1 = N(\alpha\alpha^{-1}) = N(\alpha)N(\alpha^{-1})$, hence $N(\alpha) = 1$. Conversely, if $N(\alpha) = 1$, then $\alpha\bar{\alpha} = 1$. Since the rings that we are looking at are stable under conjugation, $\bar{\alpha}$ is an element of the ring, and $\alpha$ is therefore invertible.

Finally, since the norm is multiplicative, it is enough to show that every prime number $p$ (resp. every prime number $p \equiv 1 \bmod 4$) is the norm of a Hurwitz quaternion (resp. the norm of a Gaussian integer). Since $2 = 1^2 + 1^2$, it moreover suffices to show this for odd primes $p$. To do this, we will first prove that $\mathbf{Z}[i]$ is a principal ring and that $A$ is left (or right) principal.


1.8. Proposition. The ring Z[i] is Euclidean, hence principal. The ring 
A is left Euclidean, hence left principal (and also right Euclidean and right 
principal). 


Proof. We will use the symbol $B$ for both of the rings $A$ and $\mathbf{Z}[i]$. The statement means that for $\alpha \in B$ and $\beta \in B \setminus \{0\}$, there exist $q, r \in B$ such that $\alpha = q\beta + r$ with $N(r) < N(\beta)$ (when the ring is $A$, pay attention to the order of multiplication). Once this has been proven, we immediately have that $\mathbf{Z}[i]$ is principal. Actually, the "same" proof shows that $A$ is (left) principal. Indeed, let $I$ be a non-zero left ideal of $A$ (i.e., $A \cdot I \subset I$); it contains an element $\beta \neq 0$ of minimal norm, and clearly $A\beta \subset I$. Conversely, let $\alpha \in I$, and write $\alpha = q\beta + r$ with $N(r) < N(\beta)$. We therefore have $r = \alpha - q\beta \in I$, hence $r$ is zero and $I = A\beta$. Let us now prove that $A$ and $\mathbf{Z}[i]$ are Euclidean. The proof is based on the following elementary lemma, whose proof is left to the reader.

1.9. Lemma. Let $x \in \mathbf{R}$. Then there exists $m \in \mathbf{Z}$ such that $|x - m| \le 1/2$, and there exists $n \in \mathbf{Z}$ such that $|x - n/2| \le 1/4$.


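• Now let $\alpha \in \mathbf{Z}[i]$ and $\beta \in \mathbf{Z}[i] \setminus \{0\}$, hence $\alpha/\beta = x + iy \in \mathbf{Q}[i]$, and there exist $m, n \in \mathbf{Z}$ such that $|x - m| \le 1/2$ and $|y - n| \le 1/2$. Therefore

$$N((x + iy) - (m + in)) = (x - m)^2 + (y - n)^2 \le \frac{1}{4} + \frac{1}{4} = \frac{1}{2}.$$

The Gaussian integer $q := m + ni$ is the quotient obtained from the (obviously possible) division of $\alpha$ by $\beta$ since

$$N(\alpha - q\beta) \le \frac{N(\beta)}{2} < N(\beta).$$

• If $\alpha \in A$ and $\beta \in A \setminus \{0\}$, then $\alpha\beta^{-1} = x1 + yI + zJ + tK \in \mathbf{H}$ and there exists $m \in \mathbf{Z}$ such that $|x - m/2| \le 1/4$. We therefore choose $q = (m + nI + hJ + \ell K)/2$, where $m, n, h$ and $\ell$ are integers with the same parity (so that $q \in A$) and such that $|y - n/2|$, $|z - h/2|$ and $|t - \ell/2|$ are $\le 1/2$. We therefore obtain

$$N(\alpha\beta^{-1} - q) = \left(x - \frac{m}{2}\right)^2 + \left(y - \frac{n}{2}\right)^2 + \left(z - \frac{h}{2}\right)^2 + \left(t - \frac{\ell}{2}\right)^2 \le \frac{1}{16} + \frac{3}{4} < 1$$

and hence the desired inequality,

$$N(\alpha - q\beta) < N(\beta).$$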


We can now complete the proof of the two theorems. 


Proof. (Sum of two squares.) The ring $\mathbf{Z}[i]$ is principal, hence factorial (a factorial ring is also called a unique factorization domain or UFD). It is also clear that $\mathbf{Z}[i]^* = \{\pm 1, \pm i\}$. Now let $p \equiv 1 \bmod 4$. We know that there exists $a \in \mathbf{Z}$ such that $a^2 \equiv -1 \bmod p$. Thus we have an equality of the form $(a + i)(a - i) = pm$. We can see that the Gaussian integer $p$ clearly divides neither $(a + i)$ nor $(a - i)$. It is therefore not prime, hence not irreducible, since $\mathbf{Z}[i]$ is principal. We thus have the decomposition $p = \alpha\beta$ where $\alpha$ and $\beta$ are non-invertible. Therefore, $N(\alpha\beta) = N(p) = p^2$ and $N(\alpha) = N(\beta) = p$, which proves the two-square theorem.
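The proof can be made effective: a generator of the ideal generated by $p$ and $a + i$ is a Gaussian integer of norm $p$, and it can be computed with the Euclidean division described above. The following is a sketch under these assumptions, with Gaussian integers stored as pairs `(re, im)`; the names `gauss_divmod`, `gauss_gcd` and `two_squares` are chosen here, and the search for $a$ is a naive loop.

```python
def gauss_divmod(a, b):
    """Division a = q*b + r in Z[i] with N(r) <= N(b)/2; b must be non-zero."""
    (a0, a1), (b0, b1) = a, b
    n = b0 * b0 + b1 * b1
    # a/b = a * conj(b) / N(b) = (x + i*y) / n
    x = a0 * b0 + a1 * b1
    y = a1 * b0 - a0 * b1
    # Nearest integers to x/n and y/n, using exact integer arithmetic.
    q0 = (2 * x + n) // (2 * n)
    q1 = (2 * y + n) // (2 * n)
    r = (a0 - (q0 * b0 - q1 * b1), a1 - (q0 * b1 + q1 * b0))
    return (q0, q1), r


def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]; the result is defined up to a unit."""
    while b != (0, 0):
        a, b = b, gauss_divmod(a, b)[1]
    return a


def two_squares(p):
    """For a prime p = 1 mod 4, return (x, y) with x^2 + y^2 = p."""
    a = next(a for a in range(1, p) if (a * a + 1) % p == 0)
    x, y = gauss_gcd((p, 0), (a, 1))
    return abs(x), abs(y)


for p in (5, 13, 17, 29, 37, 41, 97, 101):
    x, y = two_squares(p)
    print(p, "=", x, "^2 +", y, "^2")
```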


(Sum of four squares.) It is enough to show that if $p$ is an odd prime number, then it is the norm of an element of $A$. The number of squares in $\mathbf{Z}/p\mathbf{Z}$ is $(p + 1)/2$, and therefore the polynomial $-1 - X^2$ equals a square for at least one $X$; in other words, there exist $a, b \in \mathbf{Z}$ such that $a^2 + b^2 + 1 \in p\mathbf{Z}$. We see from this that $(1 + aI + bJ)(1 - aI - bJ) \in pA$. We therefore consider the (left) ideal $\mathcal{I}$ generated by $p$ and $1 + aI + bJ$. On the one hand, we know $\mathcal{I} = A\beta$ because $A$ is (left) principal, and on the other hand, we have the inclusions $pA = Ap \subset \mathcal{I} \subset A$. Thus $p = \alpha\beta$. Now we will check that $\beta$ and $\alpha$ are not invertible, that is, that the above inclusions are strict. If $\alpha$ were invertible, then $p$ would divide $1 + aI + bJ$, and furthermore $(1 + aI + bJ) = p(x + yI + zJ + tK)/2$, so that $px = 2$, which is impossible ($p$ is an odd prime). If $\beta$ were invertible, we would have $\mathcal{I} = A$, hence $1 = q(1 + aI + bJ) + q'p$, and by multiplying (on the right) by $(1 - aI - bJ)$, we would get $(1 - aI - bJ) = q''p$, which is equally absurd. We can therefore conclude that $N(p) = N(\alpha)N(\beta) = p^2$, where $N(\alpha)$ and $N(\beta)$ are different from 1, hence equal to $p$.
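The counting argument at the start of this proof (there are $(p + 1)/2$ squares modulo $p$, so the values $a^2$ and $-1 - b^2$ must meet) is easy to illustrate. The sketch below searches for such a pair; the function name `find_ab` is chosen here for illustration.

```python
def find_ab(p: int):
    """For an odd prime p, return (a, b) with a^2 + b^2 + 1 = 0 mod p."""
    # The (p + 1)/2 squares modulo p, each with one square root.
    squares = {(a * a) % p: a for a in range((p + 1) // 2)}
    for b in range((p + 1) // 2):
        target = (-1 - b * b) % p
        if target in squares:
            return squares[target], b
    raise AssertionError("unreachable when p is an odd prime")


for p in (3, 7, 11, 19, 23, 31, 43, 47):
    a, b = find_ab(p)
    print(p, a, b, (a * a + b * b + 1) % p)
```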


Further down, we will give another proof of the two-square (resp. four- 
square) theorem, which uses the geometry of numbers. 


2. Fermat’s Equation (n = 3 and 4) 


One of the most famous mathematical problems (called “Fermat’s last the- 
orem”) was solved by Andrew Wiles [80], with the help of Taylor, in 1995: 


2.1. Theorem. Let $n \ge 3$, and let $x, y$ and $z$ be integers such that $x^n + y^n = z^n$. Then $xyz = 0$.

Of course it is "enough" to prove the theorem for $n = 4$ and $n = p$, an odd prime. We will settle for proving it for $n = 3$ and 4, by using Fermat's principle of infinite descent. The proof proposed for $n = 4$ stays in $\mathbf{Z}$, but the one that we give for $n = 3$ takes place in $\mathbf{Z}[j]$ (with $j = \exp(2\pi i/3)$). The classical approach, due to Kummer, is based on the following factorization. Let $\zeta = \exp(2\pi i/p)$, so in the ring $\mathbf{Z}[\zeta]$ we have:

$$x^p + y^p = (x + y)(x + \zeta y) \cdots (x + \zeta^{p-1}y) = z^p.$$

We will do some calculations in the ring $\mathbf{Z}[\zeta]$, setting $\lambda = 1 - \zeta$.
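2.2. Lemma. The element $\lambda$ is prime and $\mathbf{Z}[\zeta]/\lambda\mathbf{Z}[\zeta] \cong \mathbf{F}_p$. Furthermore, we have the decomposition

$$p = \prod_{k=1}^{p-1} (1 - \zeta^k) = \epsilon\lambda^{p-1} \quad\text{where } \epsilon \in \mathbf{Z}[\zeta]^*.$$

The elements $\eta_k := \sin(k\pi/p)/\sin(\pi/p)$ and $\epsilon_k := (1 - \zeta^k)/(1 - \zeta)$ are units in $\mathbf{Z}[\zeta]$ for $1 \le k \le p - 1$, and $\eta_k/\epsilon_k$ is a root of unity.

Proof. We will begin by factoring the $p$th cyclotomic polynomial (over $\mathbf{C}$), $\Phi_p(X) = X^{p-1} + X^{p-2} + \cdots + X + 1 = \prod_{k=1}^{p-1}(X - \zeta^k)$. From this, we get the formula $p = \Phi_p(1) = \prod_{k=1}^{p-1}(1 - \zeta^k)$. Thus $\lambda$ divides $p$ and $p \in \lambda\mathbf{Z}[\zeta]$. Moreover, since $\zeta \equiv 1 \bmod \lambda$, every element of $\mathbf{Z}[\zeta]$ is congruent modulo $\lambda$ to an integer between 0 and $p - 1$ (inclusive), which proves that $\mathbf{Z}[\zeta]/\lambda\mathbf{Z}[\zeta] \cong \mathbf{F}_p$. Since $1 - \zeta^k = (1 - \zeta)(1 + \cdots + \zeta^{k-1})$, we see that $\epsilon_k$ is in $\mathbf{Z}[\zeta]$. By the same reasoning and by using the inverse $h$ of $k$ modulo $p$ and the equality $1 - \zeta = (1 - \zeta^k)(1 + \cdots + \zeta^{k(h-1)})$, we see that $\epsilon_k^{-1}$ is also an integer and therefore that $\epsilon_k \in \mathbf{Z}[\zeta]^*$. Furthermore, if $k$ is odd, then

$$\epsilon_k = \frac{1 - \zeta^k}{1 - \zeta} = e^{\pi i(k-1)/p}\,\frac{e^{\pi ik/p} - e^{-\pi ik/p}}{e^{\pi i/p} - e^{-\pi i/p}} = \zeta^{\frac{k-1}{2}}\,\frac{\sin(\pi k/p)}{\sin(\pi/p)} = \zeta^{\frac{k-1}{2}}\eta_k,$$

whereas if $k$ is even, then $\epsilon_k = -\zeta^k\epsilon_{p-k}$. Finally, if $1 - \zeta^k = \epsilon_k\lambda$, then $p = \epsilon\lambda^{p-1}$ where $\epsilon = \epsilon_1 \cdots \epsilon_{p-1} \in \mathbf{Z}[\zeta]^*$.

Remark. We could of course write other formulas which produce units such as:

$$2\cos\left(\frac{2\pi}{p}\right) = \zeta + \zeta^{-1} = \zeta^{-1}\epsilon_4\epsilon_2^{-1}.$$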
We will now return to Kummer's method for Fermat's equation in its factored form (where $x, y, z$ are relatively prime in $\mathbf{Z}$):

$$(x + y)(x + \zeta y) \cdots (x + \zeta^{p-1}y) = z^p.$$

Let $\delta \in \mathbf{Z}[\zeta]$ be a number which divides two factors of the above equation, for example $x + \zeta^iy$ and $x + \zeta^jy$; then it divides $(\zeta^i - \zeta^j)y$ and $(\zeta^i - \zeta^j)x$, hence $(\zeta^i - \zeta^j)$, and therefore $\delta$ divides $\lambda$, so $\delta = 1$ or $\lambda$ (up to a unit). If $z$ is not divisible by $p$, then the factors are relatively prime, and if we knew that $\mathbf{Z}[\zeta]$ is factorial, we could deduce that:

for $i = 0, \dots, p - 1$, $x + \zeta^iy = u_i\alpha_i^p$, where $u_i$ is a unit and $\alpha_i \in \mathbf{Z}[\zeta]$.

If $z$ is divisible by $p$, we also have, still assuming that $\mathbf{Z}[\zeta]$ is factorial, some similar identities with extra powers of $\lambda$. However, this approach is hindered by the fact that actually the ring $\mathbf{Z}[\zeta]$ is not factorial in general. In fact, if $n = p$ is prime, it is not factorial whenever $p \ge 23$. We should therefore try to find a substitute for the following lemma (where the proof is left as an exercise).


2.3. Lemma. Let $A$ be a factorial ring. If the elements $a_1, \dots, a_m \in A$ are pairwise relatively prime and $a_1 \cdots a_m = a^p$, then, up to a unit, the $a_i$ are $p$th powers.


We will start by describing the solutions of Fermat’s equation for n = 2. 


2.4. Proposition. Let $x, y, z$ be (relatively prime) integers such that $x^2 + y^2 = z^2$, then (up to switching $x$ and $y$) there exist (relatively prime) integers $u$ and $v$ such that

$$x = u^2 - v^2, \quad y = 2uv \quad\text{and}\quad z = u^2 + v^2. \tag{3.2}$$
Proof. After having simplified by their gcd, we can assume that $x, y, z$ are pairwise relatively prime. Notice that $(u^2 - v^2)^2 + (2uv)^2 = (u^2 + v^2)^2$. By considering congruences modulo 4, we know that $z$ is odd and that $x$ and $y$ have different parity; we will therefore assume that $x$ is odd and $y$ is even. We write $y^2 = z^2 - x^2 = (z - x)(z + x)$. Now notice that if $d$ divides $z - x$ and $z + x$, then it divides $2x$ and $2z$ and therefore also 2 (since $x$ and $z$ are relatively prime). Thus $\gcd(z - x, z + x) = 2$. The integers $(z - x)/2$ and $(z + x)/2$, being relatively prime and their product being a square, are themselves squares, which gives us: $z - x = 2v^2$, $z + x = 2u^2$ and $y = 2uv$, and hence $x = u^2 - v^2$ and $z = u^2 + v^2$, as in the statement of the proposition.
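The parametrization (3.2) can be compared with an exhaustive search. The sketch below is an illustration (the function names and the bound 100 are chosen here): it lists the triples with $x$ odd, $y$ even, $\gcd(x, y) = 1$ and $z \le 100$ obtained from pairs $u > v > 0$ of relatively prime integers of different parity, and checks that a brute-force search finds exactly the same set.

```python
from math import gcd, isqrt


def triples_from_parametrization(limit: int):
    """Triples (u^2 - v^2, 2uv, u^2 + v^2) with z <= limit."""
    found = set()
    u = 2
    while u * u + 1 <= limit:
        for v in range(1, u):
            if (u - v) % 2 == 1 and gcd(u, v) == 1:
                x, y, z = u * u - v * v, 2 * u * v, u * u + v * v
                if z <= limit:
                    found.add((x, y, z))
        u += 1
    return found


def triples_by_search(limit: int):
    """All (x, y, z) with x odd, y even, gcd(x, y) = 1, x^2 + y^2 = z^2, z <= limit."""
    found = set()
    for x in range(1, limit, 2):
        for y in range(2, limit, 2):
            z = isqrt(x * x + y * y)
            if z <= limit and z * z == x * x + y * y and gcd(x, y) == 1:
                found.add((x, y, z))
    return found


print(sorted(triples_from_parametrization(30)))
```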


2.5. Theorem. The equation $x^4 + y^4 = z^2$ does not have any integer solutions, except for $xyz = 0$. Consequently Fermat's equation for $n = 4$ does not have any nontrivial solutions.

Proof. The main idea of the proof is Fermat's "infinite descent", which consists of proving that if the equation has a solution $(x, y, z)$ where $xyz \neq 0$, then it has another solution $(x_1, y_1, z_1)$ where $x_1y_1z_1 \neq 0$ and $|z_1| < |z|$. This will lead to a contradiction, because a decreasing sequence of positive integers is necessarily constant after a certain point.


So let $(x, y, z)$ be a solution. We can assume that $x, y$ and $z$ are relatively prime. By the previous proposition, we know that $x^2 = u^2 - v^2$, $y^2 = 2uv$ and $z = u^2 + v^2$, where $u$ and $v$ are relatively prime. We see that $u$ and $v$ have different parity, and hence $u$ is odd, so $v = 2w$ (if not we would have $x^2 \equiv -1 \bmod 4$). By considering $y^2 = 4uw$, we see that $u$ and $w$, which are relatively prime, need to be squares, $u = z_1^2$ and $w = a^2$. Furthermore, by again applying the previous proposition to $x^2 + v^2 = u^2$, we have $x = b^2 - c^2$, $v = 2bc$ and $u = b^2 + c^2$ where $b$ and $c$ are relatively prime. However, recall that $v = 2w = 2a^2$, so we can see, as before, that $b$ and $c$ are squares, $b = x_1^2$ and $c = y_1^2$. It therefore follows that

$$z_1^2 = u = b^2 + c^2 = x_1^4 + y_1^4,$$

and we can check that $|z_1| < |z|$ by observing, for example, that $z = u^2 + v^2 = z_1^4 + 4a^4 > z_1^4$.


2.6. Theorem. The equation $x^3 + y^3 = z^3$ does not have any solutions, except for $xyz = 0$. More generally, there do not exist any algebraic integers $x, y, z \in \mathbf{Z}[j]$ such that $x^3 + y^3 = z^3$ and $xyz \neq 0$.


Proof. It will be convenient to distinguish between the two cases, the easy 
one being when xyz does not have a factor of 3 and the more difficult one 
when, for example, z has a factor of 3. The idea of the proof in the second 
case is to show that if the equation has a solution, then it would have 
another “smaller one” (the principle of “infinite descent”). 

We can show, as in Proposition 3-1.8 for $\mathbf{Z}[i]$, that the ring $A := \mathbf{Z}[j]$ is principal, hence factorial, and that the group of units is formed of $\pm 1, \pm j, \pm j^2$. In particular, we check directly¹ that if $u \in A^*$ and $u \equiv \pm 1 \bmod \lambda^2$, then $u = \pm 1$ (also recall that $\lambda$ designates the prime element $1 - j$ and that $\mathrm{ord}_\lambda(u)$ designates the largest exponent such that $\lambda^{\mathrm{ord}_\lambda(u)}$ divides $u$).

2.7. Lemma. If $x \in \mathbf{Z}[j]$ is not divisible by $\lambda$, then $x^3 \equiv \pm 1 \bmod \lambda^4$.

Proof. We can assume $x \equiv 1 \bmod \lambda$, or moreover that $x = 1 + \lambda\alpha$. Then, $x^3 - 1 = (x - 1)(x - j)(x - j^2) = \lambda^3\alpha(\alpha + 1)(\alpha + 1 + j) \equiv 0 \bmod \lambda^4$, because the elements $0, 1, 1 + j$ are distinct modulo $\lambda$ and therefore constitute all of the elements of $\mathbf{Z}[j]/\lambda\mathbf{Z}[j]$.
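The lemma can be tested numerically. Since $\lambda^2 = (1 - j)^2 = -3j$, we have $\lambda^4 = 9j^2$, so a congruence modulo $\lambda^4$ in $\mathbf{Z}[j]$ is the same as a congruence modulo 9 on both coordinates; similarly, $a + bj$ is divisible by $\lambda$ exactly when $a + b \equiv 0 \bmod 3$. The following sketch uses these two observations, with elements $a + bj$ stored as pairs `(a, b)`; the function names are chosen here for illustration.

```python
def mul(x, y):
    """Product in Z[j], using j^2 = -1 - j."""
    (a, b), (c, d) = x, y
    # (a + b j)(c + d j) = ac + (ad + bc) j + bd j^2
    return (a * c - b * d, a * d + b * c - b * d)


def cube(x):
    return mul(x, mul(x, x))


def cube_is_pm1_mod_lambda4(x) -> bool:
    """True if x^3 = +1 or -1 modulo lambda^4 (i.e., modulo 9 in Z[j])."""
    c0, c1 = cube(x)
    return c1 % 9 == 0 and ((c0 - 1) % 9 == 0 or (c0 + 1) % 9 == 0)


# All x = a + b j in a small box which are not divisible by lambda.
candidates = [(a, b) for a in range(-6, 7) for b in range(-6, 7) if (a + b) % 3 != 0]
print(all(cube_is_pm1_mod_lambda4(x) for x in candidates))
```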


We will now return to the proof of the theorem. 


First case: $\lambda$ does not divide $xyz$. By the preceding lemma, we have $x^3 \equiv \pm 1 \bmod \lambda^4$ (and the same holds for $y$ and $z$), and therefore a solution to Fermat's equation implies that $\pm 1 \pm 1 \pm 1 \equiv 0 \bmod \lambda^4$; such a congruence is obviously impossible (3 is only divisible by $\lambda^2$).

¹This remark is a very special case of the famous "Kummer lemma", which says that a unit which is congruent modulo $\lambda^p$ to a $p$th power is in fact the $p$th power of a unit in $\mathbf{Z}[\exp(2\pi i/p)]$, given that $p$ is "regular" in the sense of Remark 3-4.25 (in particular, when the ring $\mathbf{Z}[\exp(2\pi i/p)]$ is factorial, which is the case for $p = 3$).
Second case: $\lambda$ divides $xyz$. We can assume that $\lambda$ divides $z$ and not $xy$. We will show a slightly more general version, namely that the equation

$$x^3 + y^3 = uz^3, \tag{3.3}$$

where $u$ is a unit (i.e., $u \in \mathbf{Z}[j]^*$) and $m := \mathrm{ord}_\lambda z > 0$, does not have a solution in $\mathbf{Z}[j]$. Observe that $\lambda^2$ must divide $z$ since $\pm 1 \pm 1 \equiv uz^3 \bmod \lambda^4$, hence $z^3 \equiv 0 \bmod \lambda^4$, and therefore $\mathrm{ord}_\lambda(z) \ge 4/3$. We will therefore prove the descent statement:

if $x^3 + y^3 = uz^3$ where $x, y, z \in A$, $u \in A^*$ and $\mathrm{ord}_\lambda(z) = m \ge 2$, then there exist $x_1, y_1, z_1 \in A$ and $u' \in A^*$ where $\mathrm{ord}_\lambda(z_1) = m - 1$ and $x_1^3 + y_1^3 = u'z_1^3$.

We will of course begin by factoring:

$$(x + y)(x + jy)(x + j^2y) = uz^3.$$

We can see that $\lambda^2$ must divide one of the factors on the left (because $\mathrm{ord}_\lambda(z^3) = 3m \ge 6$), say $x + y$, and therefore $\mathrm{ord}_\lambda(x + jy) = \mathrm{ord}_\lambda(x + j^2y) = 1$; for example $x + jy = x + y - \lambda y$ and $\lambda$ does not divide $y$. Thus the gcd of two of the factors is exactly $\lambda$. Since $A$ is factorial, we see that

$$x + y = u_1X^3\lambda^{3m-2}, \quad x + jy = u_2Y^3\lambda, \quad x + j^2y = u_3Z^3\lambda,$$

where $\gcd(X, Y, Z) = 1$ and $u_1, u_2, u_3$ are units.

By multiplying the equations respectively by $1$, $j$ and $j^2$ and adding them, we obtain $0 = u_1X^3\lambda^{3m-2} + u_2jY^3\lambda + u_3j^2Z^3\lambda$. By simplifying by $\lambda$ and letting $u_4 := ju_3/u_2$ and $u_5 := -j^2u_1/u_2$, we obtain

$$Y^3 + u_4Z^3 = u_5(\lambda^{m-1}X)^3.$$

We finish by pointing out that $\pm 1 \pm u_4 \equiv 0 \bmod \lambda^2$, and therefore $u_4 = \pm 1$. We then let $x_1 = Y$, $y_1 = u_4Z$, $z_1 = \lambda^{m-1}X$ and $u' = u_5$ so that we have $x_1^3 + y_1^3 = u'z_1^3$ and $\mathrm{ord}_\lambda(z_1) = m - 1$.

In this section, we will always assume that $d > 0$, and we will discuss the solutions of the above equation by explaining how it is related to the units of the ring $\mathbf{Z}[\sqrt{d}]$ and "good" rational approximations of $\sqrt{d}$.

Let us point out that the equation always has as solutions $(x, y) = (\pm 1, 0)$; we will refer to these as trivial. We also point out that if $d$ is a square, $d = a^2$, then $(x - ay)(x + ay) = 1$ implies $x + ay = x - ay = 1$ (or $= -1$), hence $2ay = 0$, and therefore there are no nontrivial solutions. The only interesting case is when $d$ is not the square of an integer (hence $\sqrt{d} \notin \mathbf{Q}$). The main theorem can be stated as follows.

3.1. Theorem. Let $d$ be a positive integer which is not a square. Then there exists a nontrivial solution $(x_1, y_1) \in \mathbf{N}^* \times \mathbf{N}^*$ (called the fundamental solution) of the equation $x^2 - dy^2 = 1$ such that all positive integer solutions are given by $(x_n, y_n)$ where $x_n + y_n\sqrt{d} := (x_1 + y_1\sqrt{d})^n$ and the general solutions are given by $(\pm x_n, \pm y_n)$.


We can of course find solutions $(x_n, y_n)$ by induction starting with $(x_1, y_1)$ and observing that

$$(x_{n+1}, y_{n+1}) = (x_1x_n + dy_1y_n,\ y_1x_n + x_1y_n).$$
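The recurrence is straightforward to run. In the sketch below, the fundamental solution is found by a naive search over $y = 1, 2, \dots$, which is only an assumption made for illustration and is reasonable for small $d$ (for some values of $d$ the fundamental solution is very large, and this search becomes impractical); the function names are chosen here.

```python
from math import isqrt


def fundamental_solution(d: int):
    """Smallest (x, y) with y >= 1 and x^2 - d*y^2 = 1, by naive search (d not a square)."""
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y
        y += 1


def pell_solutions(d: int, count: int):
    """The first `count` positive solutions, generated by the recurrence."""
    x1, y1 = fundamental_solution(d)
    x, y = x1, y1
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        x, y = x1 * x + d * y1 * y, y1 * x + x1 * y
    return solutions


print(pell_solutions(2, 3))
```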


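The connection to rational approximations of $\sqrt{d}$ is the following. Suppose that $(x, y)$ is a nontrivial solution of the equation (where say $x, y > 0$), then

$$0 < \frac{x}{y} - \sqrt{d} = \frac{1}{y(x + y\sqrt{d})} < \frac{1}{2\sqrt{d}\,y^2}.$$

Conversely, if $x/y \in \mathbf{Q}$ is an approximation which satisfies the previous inequality, then

$$0 < x^2 - dy^2 = y^2\left(\frac{x}{y} - \sqrt{d}\right)\left(\frac{x}{y} + \sqrt{d}\right) < \frac{1}{2\sqrt{d}}\left(2\sqrt{d} + \frac{1}{2\sqrt{d}\,y^2}\right) < 2,$$

hence $x^2 - dy^2 = 1$ (because it is an integer). Thus a positive solution $(x, y)$ of Pell's equation corresponds to a rational approximation $x/y$ of $\sqrt{d}$ which satisfies $0 < \dfrac{x}{y} - \sqrt{d} < \dfrac{1}{2\sqrt{d}\,y^2}$.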

On the ring $\mathbf{Z}[\sqrt{d}]$, we introduce the homomorphism $\sigma(a + b\sqrt{d}) = a - b\sqrt{d}$ (why is it a homomorphism?), as well as the norm

$$N(\alpha) = \alpha\sigma(\alpha) = a^2 - db^2, \quad\text{if } \alpha = a + b\sqrt{d}.$$

The norm is multiplicative, and we have, as in $\mathbf{Z}[i]$, the following lemma, whose very similar proof is omitted.

3.2. Lemma. In the ring $\mathbf{Z}[\sqrt{d}]$, an element is invertible if and only if its norm is $\pm 1$.

If we denote by $A^* = \mathbf{Z}[\sqrt{d}]^*$ and $U_1 = \{\alpha \in A^* \mid N(\alpha) = 1\}$, we see that the index $(A^* : U_1)$ is either 2 or 1, depending on whether there exists a unit with norm $-1$. Of course, the solutions $(x, y)$ of Pell's equation correspond to the units $x + y\sqrt{d} \in U_1$, and Theorem 3-3.1 can be translated into the following statement.

3.3. Theorem. There exists a unit $\epsilon_1 \in \mathbf{Z}[\sqrt{d}]^*$, called the fundamental unit, such that

$$\mathbf{Z}[\sqrt{d}]^* = \{\pm\epsilon_1^n \mid n \in \mathbf{Z}\} \cong \{\pm 1\} \times \mathbf{Z}.$$

If $N(\epsilon_1) = +1$, then $U_1 = \mathbf{Z}[\sqrt{d}]^*$ and if $N(\epsilon_1) = -1$, then we have $U_1 = \{\pm\epsilon_1^{2n} \mid n \in \mathbf{Z}\} \cong \{\pm 1\} \times \mathbf{Z}$.

In order to prove this theorem, we introduce the "logarithm" map, $L : \mathbf{Z}[\sqrt{d}]^* \to \mathbf{R}^2$ given by the formula $L(\alpha) = (\log|\alpha|, \log|\sigma(\alpha)|)$.


3.4. Proposition. The map $L : \mathbf{Z}[\sqrt{d}]^* \to \mathbf{R}^2$ has the following properties.
i) The map $L$ is a homomorphism, i.e., $L(\alpha\beta) = L(\alpha) + L(\beta)$.
ii) Its kernel is $\pm 1$.
iii) Its image is a discrete subgroup.
iv) Its image is contained in the line $x + y = 0$.

Proof. Property i) is immediate. Property iv) comes from the fact that $\log|\alpha| + \log|\sigma(\alpha)| = \log|N(\alpha)| = 0$. To prove ii) and iii), we will show that the preimage under $L$ of a ball in $\mathbf{R}^2$ is finite, from which we can deduce, on the one hand, that the image is discrete and, on the other hand, that the kernel of $L$ is finite, and therefore composed of roots of unity, hence of $\pm 1$ since $\mathbf{Z}[\sqrt{d}] \subset \mathbf{R}$. Now, an element $\alpha \in \mathbf{Z}[\sqrt{d}]^*$ is a root of $P := X^2 - t(\alpha)X + N(\alpha) \in \mathbf{Z}[X]$, with $t(\alpha) = \alpha + \sigma(\alpha)$ (the "trace") and $N(\alpha) = \pm 1$. If $L(\alpha)$ is in a ball of radius $C$, we have $|\alpha| = \exp(\log|\alpha|) \le \exp(C)$ and the same for $|\sigma(\alpha)|$. It follows that $|t(\alpha)| \le 2\exp(C)$. Therefore there are only a finite number of possible polynomials, and hence a finite number of $\alpha$.


We will now state a classical lemma. 


3.5. Lemma. Every discrete subgroup $G$ of $\mathbf{R}$ is of the form $G = \mathbf{Z}\omega$.

Proof. (Sketch) If $G = \{0\}$, we can choose $\omega = 0$, and if not, we choose $\omega := \inf\{x \in G \mid x > 0\}$. Since $G$ is discrete, we have $\omega > 0$ and $\omega \in G$ (otherwise, there would be a sequence of elements of $G$ which converge to $\omega$, which contradicts the fact that $G$ is discrete). Finally, if $x \in G$, we choose $m \in \mathbf{Z}$ such that $m\omega \le x < (m + 1)\omega$. Therefore, $0 \le x - m\omega < \omega$ and $x - m\omega \in G$, hence $x = m\omega$.

This lemma can be applied to $L(\mathbf{Z}[\sqrt{d}]^*)$, and provides a proof of the theorem under the condition that we prove the existence of a unit $\neq \pm 1$ or of a nontrivial solution to Pell's equation. These considerations show that it suffices to prove the following proposition.

3.6. Proposition. Let $d$ be a positive integer which is not a square. Then there exists a nontrivial solution $(x_1, y_1)$ (i.e., with $y_1 \neq 0$) to the equation $x^2 - dy^2 = 1$.

A good practical method for constructing this solution is the method of continued fractions, which is succinctly described later in this section. We are first going to prove the existence of a solution by showing, with an argument due to Dirichlet (and already used in the proof of Theorem 2-4.4) called the "pigeonhole principle", that there exist good rational approximations of $\sqrt{d}$, without actually explicitly constructing them, then give the continued fractions algorithm.


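3.7. Lemma. Let $\alpha \in \mathbf{R}$ and $N \ge 1$. Then there exists a rational number $p/q \in \mathbf{Q}$ such that

$$\left|\alpha - \frac{p}{q}\right| \le \frac{1}{qN} \quad\text{and}\quad 1 \le q \le N.$$

Proof. We cut the interval $[0, 1]$ into $N$ intervals of length $1/N$. Among the $N + 1$ numbers $j\alpha - \lfloor j\alpha \rfloor$ (for $j = 0, \dots, N$), there are therefore two in the same small interval and at a distance of at most $1/N$ from each other. In other words, there exist $0 \le j < \ell \le N$ such that $|(j\alpha - \lfloor j\alpha \rfloor) - (\ell\alpha - \lfloor \ell\alpha \rfloor)| \le 1/N$. It follows that

$$\left|\alpha - \frac{\lfloor \ell\alpha \rfloor - \lfloor j\alpha \rfloor}{\ell - j}\right| \le \frac{1}{(\ell - j)N}.$$

The desired result follows by setting $p := \lfloor \ell\alpha \rfloor - \lfloor j\alpha \rfloor$ and $q := \ell - j$.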
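The pigeonhole argument of this proof is constructive and can be followed step by step. In the sketch below, $\alpha$ is taken to be a rational number (a `Fraction`) purely so that all computations are exact; this is an assumption of the illustration, not of the lemma. The function name `dirichlet_approximation` is chosen here.

```python
from fractions import Fraction
from math import floor


def dirichlet_approximation(alpha: Fraction, N: int):
    """Return (p, q) with 1 <= q <= N and |alpha - p/q| <= 1/(q*N)."""
    seen = {}  # index of a small interval -> the j whose fractional part fell into it
    for j in range(N + 1):
        fractional_part = j * alpha - floor(j * alpha)
        k = floor(fractional_part * N)  # interval [k/N, (k+1)/N) containing it
        if k in seen:
            i = seen[k]
            return floor(j * alpha) - floor(i * alpha), j - i
        seen[k] = j
    raise AssertionError("unreachable: N + 1 values fall into N intervals")


alpha = Fraction(314159, 100000)
for N in (1, 5, 10, 100):
    p, q = dirichlet_approximation(alpha, N)
    print(N, p, q, abs(alpha - Fraction(p, q)) <= Fraction(1, q * N))
```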


Let us point out that the approximation provided by the lemma satisfies $|\alpha - p/q| \le 1/q^2$.

3.8. Corollary. (Dirichlet) Let $\alpha \in \mathbf{R} \setminus \mathbf{Q}$. Then there exist infinitely many rational numbers $p/q \in \mathbf{Q}$ such that

$$\left|\alpha - \frac{p}{q}\right| \le \frac{1}{q^2}.$$

Proof. Let $N_1 \ge 1$ and $p_1/q_1$ be a rational number provided by the previous lemma such that $\left|\alpha - \dfrac{p_1}{q_1}\right| \le \dfrac{1}{q_1N_1}$. Since $\alpha \notin \mathbf{Q}$, the left-hand side of the inequality is non-zero. Therefore we can choose $N_2$ such that $1/N_2 < \left|\alpha - \dfrac{p_1}{q_1}\right|$. Now let $p_2/q_2$ be a rational number provided by the previous lemma such that $\left|\alpha - \dfrac{p_2}{q_2}\right| \le \dfrac{1}{q_2N_2}$. It follows that

$$\left|\alpha - \frac{p_2}{q_2}\right| \le \frac{1}{q_2N_2} \le \frac{1}{N_2} < \left|\alpha - \frac{p_1}{q_1}\right|,$$

therefore $p_2/q_2 \neq p_1/q_1$. It is now clear that we can iterate this process indefinitely.


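3.9. Remarks. 1) If we remove the hypothesis that $\alpha \notin \mathbf{Q}$ in the statement
of the corollary, the result would be false. To see this, if $\alpha = a/b$ and
$a/b \neq p/q$ where $\left|\alpha - \frac{p}{q}\right| \leq \frac{1}{q^2}$, then

$$\frac{1}{bq} \leq \frac{|aq - bp|}{bq} = \left|\alpha - \frac{p}{q}\right| \leq \frac{1}{q^2},$$

and hence $q \leq b$. There would therefore only exist a finite number of $p/q$.


2) Let us look at the example $\alpha = \sqrt{d}$ where $d$ is not a square. We can
prove that the corollary is optimal in the following sense: there exists a
constant $C > 0$ such that for every $p/q \in \mathbf{Q}$, we have

$$\left|\sqrt{d} - \frac{p}{q}\right| \geq \frac{C}{q^2}.$$

To do this, consider $P(X) = X^2 - d = (X - \sqrt{d})(X + \sqrt{d})$. It follows that
$|P(p/q)| \geq 1/q^2$. Now, if for example $|\sqrt{d} - p/q| \leq 1$, we have $|p/q| \leq \sqrt{d} + 1$,
then $|p/q + \sqrt{d}| \leq 2\sqrt{d} + 1$, and thus

$$\left|\sqrt{d} - \frac{p}{q}\right| = \frac{|P(p/q)|}{|p/q + \sqrt{d}|} \geq \frac{1}{(2\sqrt{d} + 1)q^2}.$$


3) If $\alpha$ is an algebraic number of degree $d \geq 3$, the same proof shows
that $\left|\alpha - \frac{p}{q}\right| \geq \frac{C}{q^d}$ (Liouville’s inequality). In 1955, Roth proved (but the
proof is much more difficult) that furthermore, for every $\varepsilon > 0$ there exists
a constant $C_\varepsilon$, which depends on $\alpha$ and $\varepsilon$, such that for every $p/q \in \mathbf{Q}$ (see
Chap. 6):

$$\left|\alpha - \frac{p}{q}\right| \geq \frac{C_\varepsilon}{q^{2+\varepsilon}}.$$
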
Proof. (of Proposition 3-3.6) We will apply Corollary 3-3.8 to $\sqrt{d} \notin \mathbf{Q}$.
Thus there are infinitely many pairs of integers $(x, y)$ such that $|\sqrt{d} - x/y| \leq 1/y^2$
and hence such that $|\sqrt{d} + x/y| \leq 2\sqrt{d} + 1$ and finally $|x^2 - dy^2| \leq 2\sqrt{d} + 1$.
In particular, there exists an integer $c$ (non-zero, since $\sqrt{d}$ is irrational) such
that there are infinitely many solutions to the equation $x^2 - dy^2 = c$. Since there
are only a finite number of classes modulo $c$, there even exist infinitely many
pairwise congruent solutions modulo $c$. So we take two distinct solutions
$(x_1, y_1)$ and $(x_2, y_2)$ of $x^2 - dy^2 = c$, with positive coordinates, which satisfy
$x_1 \equiv x_2 \bmod c$ and $y_1 \equiv y_2 \bmod c$. We set

$$u + v\sqrt{d} := \frac{x_1 + y_1\sqrt{d}}{x_2 + y_2\sqrt{d}}.$$

We therefore have

$$u^2 - dv^2 = N(u + v\sqrt{d}) = \frac{N(x_1 + y_1\sqrt{d})}{N(x_2 + y_2\sqrt{d})} = \frac{c}{c} = 1,$$

and it suffices to see that $u$ and $v$ are integers. Therefore, we compute

$$u + v\sqrt{d} = \frac{(x_1 + y_1\sqrt{d})(x_2 - y_2\sqrt{d})}{x_2^2 - dy_2^2} = \frac{x_1x_2 - dy_1y_2}{c} + \frac{y_1x_2 - x_1y_2}{c}\sqrt{d},$$

and notice that $x_1x_2 - dy_1y_2 \equiv x_1^2 - dy_1^2 \equiv 0 \bmod c$ and
$y_1x_2 - x_1y_2 \equiv y_1x_1 - x_1y_1 \equiv 0 \bmod c$, which finishes the proof.
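
The proof above is constructive enough to be run as a (very inefficient) search. The sketch below is only an illustration of the argument, under assumptions not in the text: the function name `pell_from_pigeonhole` and the cutoff `y_max` are hypothetical, and the candidates $x$ are restricted to the two integers nearest to $y\sqrt{d}$. Solutions of $x^2 - dy^2 = c$ with $|c| \leq 2\sqrt{d} + 1$ are stored by their class modulo $c$ until two of them collide.

```python
from math import isqrt


def pell_from_pigeonhole(d: int, y_max: int = 10_000) -> tuple[int, int]:
    """Return (u, v) with u*u - d*v*v == 1 and v != 0, following the proof."""
    if d <= 0 or isqrt(d) ** 2 == d:
        raise ValueError("d must be a positive non-square integer")
    bound = isqrt(4 * d) + 1           # largest integer <= 2*sqrt(d) + 1
    seen = {}                          # (c, x mod |c|, y mod |c|) -> (x, y)
    for y1 in range(1, y_max + 1):
        root = isqrt(d * y1 * y1)      # floor(y1 * sqrt(d))
        for x1 in (root, root + 1):    # the two integers nearest y1*sqrt(d)
            c = x1 * x1 - d * y1 * y1  # never 0, since d is not a square
            if abs(c) > bound:
                continue
            key = (c, x1 % abs(c), y1 % abs(c))
            if key in seen:
                x2, y2 = seen[key]
                # (x1 + y1 sqrt(d)) / (x2 + y2 sqrt(d)) = u + v sqrt(d)
                u, rem_u = divmod(x1 * x2 - d * y1 * y2, c)
                v, rem_v = divmod(y1 * x2 - x1 * y2, c)
                assert rem_u == 0 and rem_v == 0   # by the congruences mod c
                return abs(u), abs(v)
            seen[key] = (x1, y1)
    raise ValueError("no collision found below y_max")
```

For instance, for $d = 2$ the first collision occurs between $(1, 1)$ and $(7, 5)$, both of norm $-1$, and the quotient is $3 + 2\sqrt{2}$. The continued fraction method described later in this section is far more efficient.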


Supplement. The slightly more general equation $x^2 - dy^2 = m$ does not
always have a solution. For example, if $m = -1$ and $p$ is a prime number
congruent to 3 modulo 4 which divides $d$, then a solution would imply that
$x^2 \equiv -1 \bmod p$, which is impossible. More generally, for every odd prime $p$ which
divides $d$ but not $m$, it must be that $x^2 \equiv m \bmod p$, and hence $\left(\frac{m}{p}\right) = 1$.
On the other hand, if there exists one solution, then there exist infinitely many of
them, since $N(u\alpha) = m$ if $N(\alpha) = m$ and $N(u) = 1$. We have the following
proposition, which could be useful.


3.10. Proposition. Let $m \in \mathbf{Z} \setminus \{0\}$, and let $d$ be an integer which is not
a square. Then there exist $\alpha_1, \dots, \alpha_r \in \mathbf{Z}[\sqrt{d}]$ such that:

$$\{\alpha \in \mathbf{Z}[\sqrt{d}] \mid N(\alpha) = m\} = \alpha_1 U_1 \cup \dots \cup \alpha_r U_1.$$


Proof. It is clear that the set of solutions is a union of classes modulo
$U_1$. We will show that a finite union suffices. If $N(\alpha) = m$, it
follows that $\alpha$ divides $m$ and furthermore that $m\mathbf{Z}[\sqrt{d}] \subset \alpha\mathbf{Z}[\sqrt{d}]$. But
the set of ideals which contain $m\mathbf{Z}[\sqrt{d}]$ is in bijection with the ideals of
the quotient $\mathbf{Z}[\sqrt{d}]/m\mathbf{Z}[\sqrt{d}]$ and is consequently a finite set. However,
$\alpha\mathbf{Z}[\sqrt{d}] = \alpha'\mathbf{Z}[\sqrt{d}]$ is equivalent to the fact that $\alpha$ and $\alpha'$ are equal up to
a unit. The set of solutions is thus finite modulo the group of units, hence
also finite modulo the subgroup $U_1$ (which has index at most 2 in the group of units).
Continued fractions. We will now outline a procedure for calculating
good rational approximations of a real number: the algorithm of continued
fractions.


Notation. Let $a_0$ be a real number and $a_1, \dots, a_n$ be a sequence of real
numbers $> 0$. We set

$$[a_0, a_1, \dots, a_n] := a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}.$$

3.11. Definition. To a real number $x$, we can associate a sequence of
integers $a_n$ and an auxiliary sequence of real numbers $x_n$, defined as follows:
$a_0 := \lfloor x \rfloor$, $x_0 := x$, $x_{n+1} := 1/(x_n - a_n)$ and $a_{n+1} := \lfloor x_{n+1} \rfloor$. We conclude
the sequence when $x_n$ is an integer (which only happens when $x \in \mathbf{Q}$). We
define the $n$th convergent as

$$\frac{p_n}{q_n} := [a_0, a_1, \dots, a_n].$$


3.12. Lemma. The following formulas hold.

i) $x = [a_0, a_1, \dots, a_{n-1}, x_n]$.
ii) $p_{n+1} = a_{n+1}p_n + p_{n-1}$ (where $p_0 = a_0$ and $p_1 = a_1a_0 + 1$), while
$q_{n+1} = a_{n+1}q_n + q_{n-1}$ (where $q_0 = 1$ and $q_1 = a_1$).
iii) If $p_n/q_n = [a_0, \dots, a_n]$, then

$$[a_0, a_1, \dots, a_n, y] = \frac{p_n y + p_{n-1}}{q_n y + q_{n-1}}.$$

iv) $q_n p_{n-1} - p_n q_{n-1} = (-1)^n$.
v) $q_n p_{n-2} - p_n q_{n-2} = (-1)^{n-1} a_n$.

Proof. Let us point out right away that for all real numbers $a_i$, we have
$[a_0, \dots, a_{n-1}, a_n] = [a_0, \dots, a_{n-1} + 1/a_n]$. The first formula can be proven
by induction (the case $n = 0$ is satisfied by construction). Assume therefore
that $x = [a_0, \dots, a_{n-1}, x_n]$, hence $[a_0, \dots, a_n, x_{n+1}] = [a_0, \dots, a_{n-1}, a_n + 1/x_{n+1}]
= [a_0, \dots, a_{n-1}, x_n] = x$. Next,

$$[a_0, \dots, a_{n-1}, a_n, a_{n+1}] = [a_0, \dots, a_{n-1}, a_n + 1/a_{n+1}] = \frac{p'_n}{q'_n},$$

where we can assume (by induction) that the $p'_m, q'_m$ are given by the formulas
$p'_{m+1} = a'_{m+1}p'_m + p'_{m-1}$ (and likewise for $q'_{m+1}$), where $a'_m = a_m$ for $m \leq n-1$ and
$a'_n = a_n + 1/a_{n+1}$. Thus $p'_n = (a_n + 1/a_{n+1})p'_{n-1} + p'_{n-2} = (a_n + 1/a_{n+1})p_{n-1} + p_{n-2}$
and $q'_n = (a_n + 1/a_{n+1})q'_{n-1} + q'_{n-2} = (a_n + 1/a_{n+1})q_{n-1} + q_{n-2}$; hence

$$\frac{p'_n}{q'_n} = \frac{(a_n + 1/a_{n+1})p_{n-1} + p_{n-2}}{(a_n + 1/a_{n+1})q_{n-1} + q_{n-2}}
= \frac{a_{n+1}(a_n p_{n-1} + p_{n-2}) + p_{n-1}}{a_{n+1}(a_n q_{n-1} + q_{n-2}) + q_{n-1}}
= \frac{a_{n+1}p_n + p_{n-1}}{a_{n+1}q_n + q_{n-1}}.$$

Formula iii) can be proven similarly to the previous formulas. Formulas
iv) and v) can also be proven by induction, for example

$$p_{n+1}q_n - q_{n+1}p_n = (a_{n+1}p_n + p_{n-1})q_n - (a_{n+1}q_n + q_{n-1})p_n = -(p_n q_{n-1} - q_n p_{n-1}),$$

and the same for v).


3.13. Remark. We can also take as our initial values of the sequences
$(p_n)$ and $(q_n)$ the values $p_{-2} = 0$, $p_{-1} = 1$ and $q_{-2} = 1$, $q_{-1} = 0$. Moreover,
it is often helpful to write the formulas in matrix form, for example:

$$\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}
= \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}.$$

3.14. Remark. These induction formulas allow us to calculate $p_n$ and $q_n$,
starting with the computation of the $a_n$; since $a_n \geq 1$ for $n \geq 1$, we also see that $q_n$
grows at least as fast as a Fibonacci sequence and that the following lower
bound holds: $q_n \geq \left(\frac{1+\sqrt{5}}{2}\right)^{n-1}$. Thus an approximation $|x - p/q| \leq 1/2q^2$,
which we will show below must be a convergent of the continued
fraction of $x$, can be computed in $O(\log q)$ steps; this remark is used in
Exercise 3-6.12.
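
As an illustration of Definition 3-3.11 and of the recurrences in Lemma 3-3.12, here is a minimal sketch for a rational input, where the expansion terminates and all arithmetic is exact. The function names `continued_fraction` and `convergents` are hypothetical; the initial values $p_{-1} = 1$, $q_{-1} = 0$ are those of Remark 3-3.13.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x: Fraction) -> list[int]:
    """Partial quotients a_0, a_1, ... of a rational x (Definition 3.11)."""
    quotients = []
    while True:
        a_n = floor(x)
        quotients.append(a_n)
        if x == a_n:                 # x_n is an integer: the sequence stops
            return quotients
        x = 1 / (x - a_n)            # x_{n+1} = 1 / (x_n - a_n)


def convergents(quotients: list[int]) -> list[tuple[int, int]]:
    """Pairs (p_n, q_n), using p_{n+1} = a_{n+1} p_n + p_{n-1} (same for q)."""
    p_prev, p = 1, quotients[0]      # p_{-1} = 1, p_0 = a_0
    q_prev, q = 0, 1                 # q_{-1} = 0, q_0 = 1
    result = [(p, q)]
    for a_n in quotients[1:]:
        p_prev, p = p, a_n * p + p_prev
        q_prev, q = q, a_n * q + q_prev
        result.append((p, q))
    return result


cf = continued_fraction(Fraction(8, 3))   # [2, 1, 2]
pq = convergents(cf)                      # [(2, 1), (3, 1), (8, 3)]
```

Each step costs one floor and one division, which is the source of the $O(\log q)$ step count mentioned in Remark 3-3.14.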


The following theorem can also be deduced from these formulas. 


3.15. Theorem. The sequence $(p_n/q_n)$ converges to $x$. More precisely, the
sequence of the $p_{2n}/q_{2n}$ is increasing and converges to $x$ and the sequence
$p_{2n+1}/q_{2n+1}$ is decreasing and converges to $x$. The following approximation
holds:

$$\frac{1}{q_n(q_n + q_{n+1})} < \left|x - \frac{p_n}{q_n}\right| < \frac{1}{q_n q_{n+1}}.$$

Furthermore, the convergents give the best approximations of $x$, in the
following sense. If $q \leq q_n$ and $p/q \neq p_n/q_n$, then

$$q_n\left|x - \frac{p_n}{q_n}\right| < q\left|x - \frac{p}{q}\right|.$$

Furthermore, if $\left|x - \frac{p}{q}\right| \leq 1/2q^2$, then there exists $n$ such that $p/q = p_n/q_n$.


Proof. We know that $a_n = \lfloor x_n \rfloor \leq x_n$. Now, the function $[a_0, \dots, a_m]$ is
clearly an increasing (resp. decreasing) function of $a_m$, for $m$ even (resp.
$m$ odd). Therefore, if $n$ is even, then $[a_0, \dots, a_n] \leq [a_0, \dots, a_{n-1}, x_n] = x$, and
the converse if $n$ is odd. By the lemma, we know

$$\frac{p_{n-1}}{q_{n-1}} - \frac{p_n}{q_n} = \frac{(-1)^n}{q_n q_{n-1}}
\quad\text{and}\quad
\frac{p_{n-2}}{q_{n-2}} - \frac{p_n}{q_n} = \frac{(-1)^{n-1}a_n}{q_n q_{n-2}}.$$

Hence we have the ordering

$$\frac{p_{2n}}{q_{2n}} < \frac{p_{2n+2}}{q_{2n+2}} < \frac{p_{2n+1}}{q_{2n+1}} < \frac{p_{2n-1}}{q_{2n-1}},$$

and therefore $|x - p_n/q_n| < |p_n/q_n - p_{n+1}/q_{n+1}| = 1/q_nq_{n+1}$, whereas
$|x - p_n/q_n| > |p_n/q_n - p_{n+2}/q_{n+2}| = a_{n+2}/q_nq_{n+2} = a_{n+2}/q_n(a_{n+2}q_{n+1} + q_n)
\geq 1/q_n(q_{n+1} + q_n)$. These approximations clearly show that the sequence
$(p_n/q_n)$ converges to $x$. Observe also that $1/q_{n+2} < |p_n - xq_n| < 1/q_{n+1}$
and that the sequence $(|p_n - xq_n|)$ is therefore strictly decreasing. Now
let $p/q$ be a fraction with $q \leq q_n$ and $p/q \neq p_n/q_n$. We can assume that
$q_{n-1} < q$. If we solve the system of linear equations $up_n + vp_{n-1} = p$ and
$uq_n + vq_{n-1} = q$, then we obtain $u = \pm(pq_{n-1} - qp_{n-1})$ and $v = \pm(pq_n - qp_n)$.
In particular, $u$ and $v$ are non-zero integers. Since $q = uq_n + vq_{n-1} \leq q_n$, we
see that $u$ and $v$ have opposite signs and that the two quantities $u(p_n - q_nx)$
and $v(p_{n-1} - q_{n-1}x)$ therefore have the same sign. We know that
$p - qx = u(p_n - q_nx) + v(p_{n-1} - q_{n-1}x)$, hence

$$|p - qx| = |u(p_n - q_nx)| + |v(p_{n-1} - q_{n-1}x)| \geq |p_n - q_nx| + |p_{n-1} - q_{n-1}x|.$$

Finally, if $|x - p/q| \leq 1/2q^2$, we set $x - p/q = \varepsilon\theta/q^2$ where $\varepsilon = \pm 1$ and
$0 < \theta \leq 1/2$. We will expand $p/q = [a_0, \dots, a_m]$ as a finite continued fraction.
By noticing that if $a_m > 1$, then $[a_0, \dots, a_m] = [a_0, \dots, a_m - 1, 1]$,
we see that we can choose$^2$ the parity of $m$. We choose the parity in
such a way that $p_{m-1}q - pq_{m-1} = (-1)^m = \varepsilon$. We now will define $y$ by the
equality $x = (yp_m + p_{m-1})/(yq_m + q_{m-1})$. Solving explicitly for $y$ yields
$y = (q - \theta q_{m-1})/\theta q$. By using $q_{m-1} < q$ and $\theta \leq 1/2$, we see that $y > 1$,
and we can therefore write $y = [a_{m+1}, \dots]$, where $a_{m+1} \geq 1$. By expanding
the obtained continued fraction $x = [a_0, \dots, a_m, a_{m+1}, \dots]$, we have that
$p/q = [a_0, \dots, a_m]$ is a convergent.


3.16. Remarks. 1) Whenever $x \in \mathbf{Q}$, its expansion as a continued
fraction is finite (i.e., there exists $n$ such that $x_n$ is an integer, where the sequence stops).


$^2$ It can also be shown that this is the only possible ambiguity in the expression of a
continued fraction.

2) When $x \in \mathbf{R} \setminus \mathbf{Q}$, we therefore have that $x = \lim_{n \to \infty}[a_0, \dots, a_n]$, which
by convention is written $x = [a_0, \dots, a_n, \dots]$ and which is referred to as
the continued fraction expansion of $x$.


3) A solution to Pell’s equation $p^2 - dq^2 = 1$ provides, as we have seen, a
good approximation $p/q$ of $\sqrt{d}$. It should therefore appear as a convergent
of the continued fraction expansion of $\sqrt{d}$. This is precisely how we find it,
and in fact fairly rapidly, considering Remark 3-3.14.


3.17. Examples. The continued fraction expansions of $x = \sqrt{2}$ and of
$y = \sqrt{7}$ are written respectively as

$$\sqrt{2} = [1, 2, 2, 2, \dots] \quad\text{and}\quad \sqrt{7} = [2, 1, 1, 1, 4, 1, 1, 1, 4, \dots].$$

It can be verified that these expansions are periodic. In the case of $\sqrt{2}$, the
initial convergent $p_0/q_0 = 1/1$ gives $p_0^2 - 2q_0^2 = -1$ and $p_1/q_1 = 3/2$ gives
$p_1^2 - 2q_1^2 = +1$. In the case of $\sqrt{7}$, the convergent $p_3/q_3 = 8/3$ gives
$p_3^2 - 7q_3^2 = +1$. The fact that the continued fraction expansion is periodic is a
very special case of Lagrange’s theorem which says that the continued fraction
expansion of the real number $x$ is (eventually) periodic if and only if $x$ is
quadratic, i.e., the root of a quadratic equation with integer coefficients (see,
for example, Hardy and Wright’s book [4]).

Let us give an example which illustrates the quality of the continued fraction
algorithm: finding solutions of the equation $x^2 - 61y^2 = 1$ (try to
find a solution to this one by guess and check!). The continued fraction
expansion of $x = \sqrt{61}$ is written

$$\sqrt{61} = [7, 1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14, 1, \dots],$$

and the expansion becomes periodic starting at $a_{12} = a_1 = 1$. The first
convergents are

$$\frac{7}{1},\ \frac{8}{1},\ \frac{39}{5},\ \frac{125}{16},\ \frac{164}{21},\ \frac{453}{58},\ \frac{1070}{137},\ \frac{1523}{195},\ \frac{5639}{722},\
\frac{24079}{3083},\ \frac{29718}{3805},\ \frac{440131}{56353},\ \frac{469849}{60158}.$$

The tenth convergent, $p_{10}/q_{10} = 29718/3805$, provides the first solution to
$x^2 - 61y^2 = -1$. The fundamental solution of $x^2 - 61y^2 = 1$ is from then
on given by $x_1 + y_1\sqrt{61} = (p_{10} + q_{10}\sqrt{61})^2$, or

$$(x_1, y_1) = (1766319049, 226153980).$$


We will indicate, without proof (see for example the entertaining article
[50] which describes, among other things and in detail, Archimedes’ cattle
problem, whose solution comes from the solution of a Pell equation), the
following facts, which can be checked in the previous examples. We know
that the expansion of $\sqrt{d}$ can be written as $\sqrt{d} = [a_0, \dots, a_r, a_{r+1}, \dots]$.


i) If $r$ is the first subscript such that $a_{r+1} = 2a_0$, then the expansion of
$\sqrt{d}$ becomes periodic starting at the latter coefficient (i.e., the sequence
$a_1, \dots, a_r, 2a_0$ repeats).

ii) If $r$ is odd (which is the case for $d = 7$), then $(p_r, q_r)$ provides the
smallest solution to Pell’s equation $x^2 - dy^2 = +1$ and there are no
solutions to the equation $x^2 - dy^2 = -1$.

iii) If $r$ is even (which is the case for $d = 2$ or $61$), then $(p_r, q_r)$ provides
the smallest solution to the equation $x^2 - dy^2 = -1$ and the smallest
solution to Pell’s equation $x^2 - dy^2 = +1$ is given by $(p_{2r+1}, q_{2r+1})$.
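
Facts i) to iii) translate directly into a procedure for the smallest solution of $x^2 - dy^2 = 1$. The sketch below is one possible implementation, not the book's: the function names are hypothetical, and the integer recurrence used to expand $\sqrt{d}$ (writing each $x_n$ as $(\sqrt{d} + m)/k$ with integers $m, k$) is a standard technique assumed here, since the text does not spell out how the $a_n$ are computed.

```python
from math import isqrt


def sqrt_partial_quotients(d: int) -> list[int]:
    """Return [a_0, a_1, ..., a_r, 2*a_0] for sqrt(d), d a positive non-square."""
    a0 = isqrt(d)
    if d <= 0 or a0 * a0 == d:
        raise ValueError("d must be a positive non-square integer")
    m, k, a = 0, 1, a0               # x_n = (sqrt(d) + m) / k, a_n = floor(x_n)
    quotients = [a0]
    while a != 2 * a0:               # fact i): the period ends at 2*a_0
        m = k * a - m
        k = (d - m * m) // k         # exact division (standard property)
        a = (a0 + m) // k
        quotients.append(a)
    return quotients


def pell_fundamental(d: int) -> tuple[int, int]:
    """Smallest (x, y) with x*x - d*y*y == 1 and y > 0, via facts ii) and iii)."""
    quotients = sqrt_partial_quotients(d)
    r = len(quotients) - 2
    if r % 2 == 1:
        needed = quotients[: r + 1]            # a_0, ..., a_r
    else:
        # a_0, ..., a_{r+1}, followed by a_{r+2}, ..., a_{2r+1} = a_1, ..., a_r
        needed = quotients + quotients[1 : r + 1]
    p_prev, p, q_prev, q = 1, needed[0], 0, 1
    for a_n in needed[1:]:
        p_prev, p = p, a_n * p + p_prev
        q_prev, q = q, a_n * q + q_prev
    return p, q
```

With this sketch, `pell_fundamental(7)` returns `(8, 3)` and `pell_fundamental(61)` returns `(1766319049, 226153980)`, in agreement with Examples 3-3.17.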


4. Rings of Algebraic Integers 


In this part, we will give you an idea of what some of the general properties
of ring extensions of $\mathbf{Z}$ are. These properties will lead us to the notion of
a Dedekind ring, which in turn generalizes some of the examples that we
have already encountered with $\mathbf{Z}[\sqrt{d}]$ and $\mathbf{Z}[\exp(2\pi i/n)]$. The main tools
are algebra and some geometry of numbers.


We are familiar with the notion of an algebraic element (here “algebraic” is 
always taken to be in the sense “algebraic over the rationals”); this notion 
comes from field theory, and the corresponding notion for rings is as follows. 


4.1. Definition. An algebraic integer is a complex number $\alpha$ which is
a root of a monic polynomial with integer coefficients. More generally, an
element $\alpha$ is called integral or an algebraic integer over a ring $A$ if it is the
root of a monic polynomial with coefficients in $A$.


Example. A rational number $\alpha = a/b$ is the root of $bX - a \in \mathbf{Z}[X]$ and, by
making use of the fact that $\mathbf{Z}$ is factorial, we see that if $\alpha$ is an algebraic
integer, then $\alpha \in \mathbf{Z}$.


4.2. Definition. An integral domain $A$ is integrally closed if the only
elements of $K := \operatorname{Frac}(A)$ which are integral over $A$ are elements of $A$.


Examples. We can easily show that a principal or factorial ring is integrally
closed. However, the ring $A = \mathbf{Z}[\sqrt{5}]$ is not integrally closed, since the
number $\alpha := \frac{1+\sqrt{5}}{2}$, which is in $\mathbf{Q}(\sqrt{5})$, is a root of $X^2 - X - 1$, and is
therefore integral over $A$ (and even over $\mathbf{Z}$) without being in $A$.



4.3. Lemma. An element $\alpha$ is an algebraic integer over $A$ if and only if
$A[\alpha]$ is a finitely generated $A$-module, and also if and only if the ring $A[\alpha]$
is contained in a ring, containing $A$ and $\alpha$, which is a finitely generated
$A$-module.


4.4. Corollary. The sum, difference and product of two algebraic integers
is an algebraic integer. If $\alpha$ is integral over $B$ and every element of $B$
is integral over $A$, then $\alpha$ is integral over $A$. In particular, if $K$ is an
extension of $\mathbf{Q}$, then the set

$$\mathcal{O}_K := \{\alpha \in K \mid \alpha \text{ is an algebraic integer}\}$$

is a ring.


Proof. If $\alpha$ is integral over $A$, then it satisfies an equation
$\alpha^n = a_{n-1}\alpha^{n-1} + \dots + a_0$ with $a_i \in A$. The $A$-module
$A[\alpha] = A + A\alpha + \dots + A\alpha^{n-1}$ is therefore finitely generated. Conversely, if
$A[\alpha]$ is contained in a finitely generated $A$-module, $Au_1 + \dots + Au_m$, we can
write $\alpha u_i = \sum_{j=1}^{m} a_{i,j}u_j$, with $a_{i,j} \in A$. We will therefore let $M$ be the
$m \times m$ matrix of the coefficients $a_{i,j}$. The polynomial $P(X) := \det(X\,\mathrm{Id} - M)$ is
monic with coefficients in $A$ and $P(\alpha) = 0$ (think of the Cayley-Hamilton theorem,
or redo its proof), and hence $\alpha$ is integral over $A$. For the corollary, observe
that if $\alpha$ and $\beta$ are algebraic integers, then $\mathbf{Z}[\alpha, \beta]$ is a finitely generated
$\mathbf{Z}$-module (a generating set is given by a finite number of $\alpha^i\beta^j$), and hence its
elements are all integral over $\mathbf{Z}$.


4.5. Definition. A number field is a finite extension $K$ of $\mathbf{Q}$ and $\mathcal{O}_K$ is
the ring of integers of $K$.


4.6. Remark. We can always assume (by the primitive element theorem)
that there exists an $\alpha$ such that $K = \mathbf{Q}(\alpha)$. It should also be noted that if
$\alpha$ is algebraic over $\mathbf{Q}$, then there exists a non-zero integer $d \in \mathbf{Z}$ (a “denominator”)
such that $d\alpha$ is an algebraic integer. In particular, $K$ is the field of fractions
of $\mathcal{O}_K$.


4.7. Proposition. The ring $\mathcal{O}_K$ is integrally closed.


Proof. An element of $K$ which is integral over $\mathcal{O}_K$ is integral over $\mathbf{Z}$; this is
an immediate consequence of Lemma 3-4.3 above. It is therefore in $\mathcal{O}_K$.


4.8. Definition. Let $K$ be a number field and $\alpha \in K$; we define the norm
$N(\alpha) = N^K_{\mathbf{Q}}(\alpha)$ (resp. the trace $\operatorname{Tr}(\alpha) = \operatorname{Tr}^K_{\mathbf{Q}}(\alpha)$) to be the determinant
(resp. the trace) of multiplication by $\alpha$, viewed as a $\mathbf{Q}$-linear map from $K$
to $K$.

Afterwards, we will give a more concrete expression for the trace and the
norm.


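4.9. Lemma. Let $\alpha$ be algebraic over $\mathbf{Q}$ and $K = \mathbf{Q}(\alpha)$, and let

$$P(X) = X^d + a_{d-1}X^{d-1} + \dots + a_0 = (X - \alpha_1) \cdots (X - \alpha_d)$$

be the minimal polynomial of $\alpha$ over $\mathbf{Q}$. Then we have

$$N^K_{\mathbf{Q}}(\alpha) = \alpha_1 \cdots \alpha_d \quad\text{and}\quad \operatorname{Tr}^K_{\mathbf{Q}}(\alpha) = \alpha_1 + \dots + \alpha_d.$$

More generally, if $\alpha \in K$ and $m = [K : \mathbf{Q}(\alpha)]$, then $N^K_{\mathbf{Q}}(\alpha) = (\alpha_1 \cdots \alpha_d)^m$
and $\operatorname{Tr}^K_{\mathbf{Q}}(\alpha) = m(\alpha_1 + \dots + \alpha_d)$.


Proof. We are only going to prove the case $K = \mathbf{Q}(\alpha)$ and leave the
general case as an exercise. It is sufficient to notice that the characteristic
polynomial of multiplication by $\alpha$, seen as a $\mathbf{Q}$-linear map from $K$ to $K$,
is nothing but the minimal polynomial of $\alpha$. This is easily seen by taking
the elements $1, \alpha, \dots, \alpha^{d-1}$ as a basis for $K$ over $\mathbf{Q}$.


4.10. Remark. We can immediately deduce from the previous lemma
that if $\alpha \in \mathcal{O}_K$, then $\operatorname{Tr}^K_{\mathbf{Q}}(\alpha)$ and $N^K_{\mathbf{Q}}(\alpha)$ are in $\mathbf{Z}$. This follows from the
fact that they are in $\mathbf{Q}$ and are algebraic integers.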


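4.11. Examples.


1. If $K = \mathbf{Q}(\sqrt{d})$, where $d$ is square-free, then $\mathcal{O}_K = \mathbf{Z}[\sqrt{d}]$ if $d \equiv 2$ or
$3 \bmod 4$, but $\mathcal{O}_K = \mathbf{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ if $d \equiv 1 \bmod 4$. This follows from
the fact that if $\alpha \in \mathcal{O}_K$, we can write $\alpha = x + y\sqrt{d}$, where a priori
$x, y \in \mathbf{Q}$. We know that the trace and the norm are in $\mathbf{Z}$ and actually,
since $\alpha$ is a root of $X^2 - \operatorname{Tr}(\alpha)X + N(\alpha)$, this is equivalent to $\alpha \in \mathcal{O}_K$.
Now, $\operatorname{Tr}(\alpha) = 2x$ and $N(\alpha) = x^2 - dy^2$, hence $x = a/2$, $y = b/2$, where
$a, b \in \mathbf{Z}$ and $a^2 - db^2 \in 4\mathbf{Z}$. If $a$ is even, then $b$ is also even and vice
versa. If $a$ and $b$ are odd, we obtain $d \equiv 1 \bmod 4$, which proves the
result.


2. If $K = \mathbf{Q}(\zeta)$, where $\zeta := \exp(2\pi i/p)$, then $\mathcal{O}_K = \mathbf{Z}[\zeta]$. We have seen
that $\lambda\mathbf{Z}[\zeta] \cap \mathbf{Z} = p\mathbf{Z}$ (recall that $\lambda := 1 - \zeta$). If
$\alpha = a_0 + a_1\zeta + \dots + a_{p-2}\zeta^{p-2}$ is an algebraic integer, where the $a_i$ are a priori
rational numbers, we can check that $\operatorname{Tr}(\lambda\alpha) = pa_0$. Now, $\operatorname{Tr}(\lambda\alpha)$ is in the ideal
generated by $\lambda$, but also in $\mathbf{Z}$, so it is therefore an integer multiple of $p$.
We can deduce from this that $a_0 \in \mathbf{Z}$. Then we start over again with
$\alpha' := \zeta^{-1}(\alpha - a_0)$, and we then can conclude that $a_1 \in \mathbf{Z}$, and so on.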
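
The criterion used in the first example (an element $x + y\sqrt{d}$ with $x, y \in \mathbf{Q}$ is an algebraic integer exactly when its trace and norm are in $\mathbf{Z}$) is easy to experiment with. The following sketch computes the trace and the norm as in Definition 3-4.8, from the matrix of multiplication in the basis $(1, \sqrt{d})$; the function names are hypothetical and only quadratic fields are covered.

```python
from fractions import Fraction


def mult_matrix(x, y, d):
    """Matrix of multiplication by x + y*sqrt(d) in the basis (1, sqrt(d))."""
    # (x + y sqrt(d)) * 1       = x   + y sqrt(d)   -> first column  (x, y)
    # (x + y sqrt(d)) * sqrt(d) = d*y + x sqrt(d)   -> second column (d*y, x)
    return [[x, d * y], [y, x]]


def trace_and_norm(x, y, d):
    """Trace and determinant of the multiplication matrix, as exact fractions."""
    m = mult_matrix(Fraction(x), Fraction(y), d)
    trace = m[0][0] + m[1][1]                        # 2x
    norm = m[0][0] * m[1][1] - m[0][1] * m[1][0]     # x^2 - d y^2
    return trace, norm


def is_algebraic_integer(x, y, d):
    """True if x + y*sqrt(d) has integral trace and norm (d square-free)."""
    trace, norm = trace_and_norm(x, y, d)
    return trace.denominator == 1 and norm.denominator == 1


half = Fraction(1, 2)
golden = trace_and_norm(half, half, 5)   # (1, -1): root of X^2 - X - 1
```

For example, `is_algebraic_integer(half, half, 5)` holds since $5 \equiv 1 \bmod 4$, while `is_algebraic_integer(half, half, 3)` does not, the norm being $-1/2$.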
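4.12. Proposition. If $[K : \mathbf{Q}] = n$, then there exist $e_1, \dots, e_n \in \mathcal{O}_K$ such
that $\mathcal{O}_K = \mathbf{Z}e_1 \oplus \dots \oplus \mathbf{Z}e_n$ (as an abelian group or $\mathbf{Z}$-module).


More generally, if $I$ is a non-zero ideal of $\mathcal{O}_K$, then there exist
$e'_1, \dots, e'_n \in \mathcal{O}_K$ such that $I = \mathbf{Z}e'_1 \oplus \dots \oplus \mathbf{Z}e'_n$.


Let us point out that it is not always true that there exists an algebraic
integer $\alpha$ such that $\mathcal{O}_K = \mathbf{Z}[\alpha]$.


Proof. The $\mathbf{Q}$-bilinear form $(x, y) := \operatorname{Tr}(xy)$ from $K \times K$ to $\mathbf{Q}$ is nondegenerate
(because if $x \neq 0$, then $\operatorname{Tr}(xx^{-1}) = [K : \mathbf{Q}] \neq 0$, or see Exercise
3-6.15). If $f_1, \dots, f_n$ are vectors in a basis of $K$ over $\mathbf{Q}$, we can assume,
up to multiplication by a common denominator $d_0 \in \mathbf{Z}$, that they are
in $\mathcal{O}_K$. Let $f_1^*, \dots, f_n^*$ be a dual basis (i.e., such that $\operatorname{Tr}(f_i f_j^*) = \delta_{ij}$)
and let $d$ be a common denominator of the $f_j^*$. For $x \in \mathcal{O}_K$, we can
therefore write $x = x_1f_1 + \dots + x_nf_n$ where $x_i \in \mathbf{Q}$. We know that
$\operatorname{Tr}(x(df_i^*)) = d\operatorname{Tr}(xf_i^*) = dx_i$ is in $\mathbf{Z}$, and therefore

$$\mathbf{Z}f_1 \oplus \dots \oplus \mathbf{Z}f_n \subset \mathcal{O}_K \subset \frac{1}{d}\left(\mathbf{Z}f_1 \oplus \dots \oplus \mathbf{Z}f_n\right),$$

which proves the first assertion.


If $I = \alpha\mathcal{O}_K$, then we can choose $e'_i = \alpha e_i$. If the ideal is not principal
anymore, then we can nevertheless choose $\alpha \in I \setminus \{0\}$ such that, for a
certain $d \geq 1$, we have

$$\mathbf{Z}\alpha e_1 \oplus \dots \oplus \mathbf{Z}\alpha e_n = \alpha\mathcal{O}_K \subset I \subset \mathcal{O}_K \subset \frac{1}{d}\left(\mathbf{Z}\alpha e_1 \oplus \dots \oplus \mathbf{Z}\alpha e_n\right).$$

The second assertion follows from this.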


4.13. Definition. Let $I$ be a non-zero ideal of $\mathcal{O}_K$. The norm of the ideal
is defined as

$$N(I) := \operatorname{card}(\mathcal{O}_K/I).$$


4.14. Proposition. If $\alpha \in \mathcal{O}_K$ is non-zero, then

$$N(\alpha\mathcal{O}_K) = |N^K_{\mathbf{Q}}(\alpha)|.$$

Furthermore, the norm is multiplicative on ideals: $N(IJ) = N(I)\,N(J)$.


Proof. If $M$ is a $\mathbf{Z}$-linear map from $\mathbf{Z}^n$ to $\mathbf{Z}^n$ with non-zero determinant, we
have $\operatorname{card}(\mathbf{Z}^n/M\mathbf{Z}^n) = |\det(M)|$. If we denote by $M(\alpha)$ the multiplication
by $\alpha$ from $\mathcal{O}_K$ to $\mathcal{O}_K$, we obtain

$$N(\alpha\mathcal{O}_K) = \operatorname{card}(\mathcal{O}_K/\alpha\mathcal{O}_K) = |\det(M(\alpha))| = |N^K_{\mathbf{Q}}(\alpha)|.$$

For the moment, we will settle for proving the second property in two
special cases: the case where $I$ and $J$ are comaximal, i.e., $I + J = \mathcal{O}_K$, and
the case where one of the two ideals is principal. The general case is more
subtle and will be proven after the proof of Theorem 3-4.18 (we leave it to
the reader to verify that no vicious circle is going on).


If $I + J = \mathcal{O}_K$, then there exist $i_0 \in I$ and $j_0 \in J$ such that $i_0 + j_0 = 1$. We
can deduce from this firstly that $I \cap J = IJ$ because $x \in I \cap J$ can be written
$x = xi_0 + xj_0$ and secondly that the ring homomorphism $\mathcal{O}_K \to \mathcal{O}_K/I \times \mathcal{O}_K/J$,
whose kernel is $I \cap J$, is surjective because $bi_0 + aj_0 \equiv a \bmod I$ and
$bi_0 + aj_0 \equiv b \bmod J$. This homomorphism thus induces a ring isomorphism
$\mathcal{O}_K/IJ \cong \mathcal{O}_K/I \times \mathcal{O}_K/J$ (by the generalized Chinese remainder theorem).
Thus we have $N(IJ) = N(I)\,N(J)$.


Now assume that $J = \alpha\mathcal{O}_K$. It follows from the exact sequence (of
$\mathcal{O}_K$-modules or simply of abelian groups),

$$0 \to J/IJ \to \mathcal{O}_K/IJ \to \mathcal{O}_K/J \to 0,$$

that $N(IJ) = N(J)\operatorname{card}(J/IJ)$. The morphism $\varphi: \mathcal{O}_K \to J/IJ$ given by
$\varphi(x) := \alpha x \bmod IJ$ is surjective, and its kernel is equal to
$\{x \in \mathcal{O}_K \mid \alpha x \in \alpha I\} = I$. The desired equality
$N(I) = \operatorname{card}(\mathcal{O}_K/I) = \operatorname{card}(J/IJ)$ follows from this.


Example of a non-principal ring. The ring $\mathbf{Z}[i\sqrt{3}]$ is neither principal nor
factorial, because it is not integrally closed: $\mathbf{Z}[i\sqrt{3}]$ is strictly contained in
$\mathbf{Z}[(1 + i\sqrt{3})/2]$, which is principal, and has the same fraction field, namely
$\mathbf{Q}(i\sqrt{3})$. More fundamentally, the rings $\mathbf{Z}[\sqrt{10}]$ and $\mathbf{Z}[i\sqrt{5}]$ are neither
principal nor factorial. To see this, notice that

$$9 = 3^2 = (\sqrt{10} + 1)(\sqrt{10} - 1) \quad\text{and}\quad 6 = 2 \cdot 3 = (1 + i\sqrt{5})(1 - i\sqrt{5})$$

give two essentially different decompositions into products of irreducible
elements. In fact, we can show directly that the ideal generated by 3 and
$\sqrt{10} + 1$ in $\mathbf{Z}[\sqrt{10}]$ (resp. the ideal generated by 2 and $i\sqrt{5} + 1$ in $\mathbf{Z}[i\sqrt{5}]$)
is not principal, because the quotient by the ideal is $\mathbf{Z}/3\mathbf{Z}$ (resp. $\mathbf{Z}/2\mathbf{Z}$)
and there is no element of norm $\pm 3$ in $\mathbf{Z}[\sqrt{10}]$ (resp. of norm 2 in $\mathbf{Z}[i\sqrt{5}]$).
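
The two norm claims can be checked mechanically. The sketch below is an illustration only (all names are hypothetical): it uses the fact that $x^2 - 10y^2 = \pm 3$ would force $x^2 \equiv \pm 3 \bmod 10$, and that $x^2 + 5y^2 \in \{2, 3\}$ forces $y = 0$ and $|x| \leq 1$, so a finite search suffices.

```python
def norm(a: int, b: int, d: int) -> int:
    """Norm of a + b*sqrt(d) in Z[sqrt(d)] (d may be negative)."""
    return a * a - d * b * b


def squares_mod(n: int) -> set[int]:
    """Set of squares in Z/nZ."""
    return {x * x % n for x in range(n)}


# Z[sqrt(10)]: x^2 - 10 y^2 = +3 or -3 would need 3 or 7 to be a square mod 10.
no_norm_pm3_in_sqrt10 = 3 not in squares_mod(10) and 7 not in squares_mod(10)

# Z[i sqrt(5)]: x^2 + 5 y^2 is at least 5 as soon as y != 0, so only
# y = 0 and |x| <= 1 could give the values 2 or 3.
small = range(-1, 2)
no_norm_2_or_3_in_isqrt5 = all(
    norm(x, y, -5) not in (2, 3) for x in small for y in small
)

# Norms of the factors appearing in the two decompositions above.
norms_sqrt10 = [norm(3, 0, 10), norm(1, 1, 10), norm(-1, 1, 10)]    # 9, -9, -9
norms_isqrt5 = [norm(2, 0, -5), norm(3, 0, -5), norm(1, 1, -5), norm(1, -1, -5)]
```

Since a proper factor of an element of norm $\pm 9$ (resp. 4, 6 or 9) would have norm $\pm 3$ (resp. 2 or 3), the absence of such norms also shows that the factors in both decompositions are irreducible.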


In order to measure how non-principal a ring is, we can introduce the 
following equivalence relation on ideals. 


4.15. Definition. Two non-zero ideals $I$ and $J$ are equivalent, denoted
$I \sim J$, if there exist two non-zero elements $\alpha, \beta \in \mathcal{O}_K$ such that $\alpha I = \beta J$.


The ring $\mathcal{O}_K$ is principal if and only if there is only one equivalence class.
We will see that the set of classes is finite (Theorem 3-4.23, below) and
forms a group, i.e., every ideal is invertible: for every non-zero ideal $I$,
there exist a non-zero $\alpha \in \mathcal{O}_K$ and an ideal $J$ such that $IJ = \alpha\mathcal{O}_K$.

4.16. Definition. A ring $A$ is a Dedekind ring if it is Noetherian, integrally
closed and if every non-zero prime ideal is maximal.


4.17. Examples. The fundamental example of a Dedekind ring is the
ring of integers, $\mathcal{O}_K$, of a number field. To see this, it is integrally closed
(cf. Proposition 3-4.7) and, since the quotient by a non-zero ideal is finite,
the two other conditions can be easily checked. The set of ideals which
contain a given non-zero ideal is finite, and a finite ring is integral if and
only if it is a field.

If $k$ is a field, the ring $k[T]$ is a Dedekind ring since it is principal. More
generally, rings of the form $A = k[X, Y]/(f) = k[x, y]$ are Dedekind provided
that they are integrally closed (be careful: for example, if $f = Y^2 - X^3$,
the element $\alpha = y/x$ is integral over $A$ without being in $A$).


The fundamental property of Dedekind rings, which in some sense
replaces the notion of factoriality, is formulated in the following theorem.


4.18. Theorem. Every non-zero ideal of $\mathcal{O}_K$ can be decomposed as a
product of prime ideals; furthermore, this decomposition is unique (up to
the order).


Proof. We will start by stating some purely algebraic remarks. If $\beta \in K$
and $I$ is a non-zero ideal in $\mathcal{O}_K$ which has the property that $\beta I \subset I$, then Lemma
3-4.3 shows that $\beta$ is an algebraic integer. If $I$ and $J$ are two ideals such
that $I = IJ$, then $J = \mathcal{O}_K$. To see this, if $a_1, \dots, a_n$ is a basis of $I$ over $\mathbf{Z}$,
then there exist $b_{ij} \in J$ such that $a_i = \sum_j b_{ij}a_j$, hence $\det(b_{ij} - \delta_{ij}) = 0$,
and so $1 \in J$. From this, we can deduce the following assertion:

$$\text{if } \alpha I = JI, \text{ then } J = \alpha\mathcal{O}_K.$$

To see why this is true, for every $\beta \in J$ we have $\beta I \subset JI = \alpha I$, hence
$(\beta\alpha^{-1})I \subset I$, and therefore $\beta\alpha^{-1}$ is an algebraic integer, and moreover
$\beta \in \alpha\mathcal{O}_K$. Thus $\alpha^{-1}J$ is an ideal in $\mathcal{O}_K$ and $\alpha^{-1}JI = I$, therefore
$\alpha^{-1}J = \mathcal{O}_K$ and $J = \alpha\mathcal{O}_K$. In the following section, we will use results from the
geometry of numbers to show that there are a finite number of equivalence
classes of ideals modulo principal ideals. If $I$ is an ideal in $\mathcal{O}_K$, then there
exist $m < n$ such that $I^m$ and $I^n$ are in the same class and, moreover,
$\alpha I^m = \beta I^n$. We can deduce from this that $\alpha\mathcal{O}_K = \beta I^{n-m}$, and hence

$$\text{for every ideal } I \text{ of } \mathcal{O}_K, \text{ there exist } h \geq 1 \text{ and } \gamma \in \mathcal{O}_K \text{ such that } I^h = \gamma\mathcal{O}_K.$$

This allows us to prove the “cancellation” property of ideals:

$$\text{if } IJ = IJ', \text{ then } J = J'.$$

To see why this is true, by multiplying by $I^{h-1}$, we obtain $\gamma J = \gamma J'$,
hence $J = J'$. We can also show that inclusion of ideals is equivalent to
divisibility:

$$\text{if } I \subset J, \text{ then there exists an ideal } J' \text{ such that } I = JJ'.$$

This can be explained by noticing that if $J^h = \beta\mathcal{O}_K$, we obtain
$J^{h-1}I \subset \beta\mathcal{O}_K$, thus $J' := \beta^{-1}J^{h-1}I$ is an ideal in $\mathcal{O}_K$, and $JJ' = \beta^{-1}J^hI = I$.


We will now prove the existence of the decomposition of an ideal. Let
$I \neq \mathcal{O}_K$. If $\mathfrak{p}_1$ is a maximal ideal which contains $I$, $I \subset \mathfrak{p}_1$, then $I = \mathfrak{p}_1I_1$;
if $I_1 \neq \mathcal{O}_K$, we can still write $I = \mathfrak{p}_1\mathfrak{p}_2I_2$. In this way, we iteratively
construct a sequence of prime ideals such that $I = \mathfrak{p}_1 \cdots \mathfrak{p}_nI_n$, and since
$\mathcal{O}_K$ is Noetherian, the process must eventually stop, i.e., there exists an $n$
such that $I_n = \mathcal{O}_K$, and hence $I = \mathfrak{p}_1 \cdots \mathfrak{p}_n$.


We are now going to prove the uniqueness of the decomposition of an ideal.
In order to do this, we will first point out that the previous results show
that $\mathfrak{p}^{m+1}$ is included in $\mathfrak{p}^m$ and distinct from $\mathfrak{p}^m$. Hence we can define

$$\operatorname{ord}_{\mathfrak{p}}(I) := \max\{m \geq 0 \mid I \subset \mathfrak{p}^m\}.$$

We can easily check that $\operatorname{ord}_{\mathfrak{p}}(I)$ is zero for almost every $\mathfrak{p}$ and that

$$I = \prod_{\mathfrak{p}} \mathfrak{p}^{\operatorname{ord}_{\mathfrak{p}}(I)}.$$

This, together with the cancellation property, finishes the proof.


Now we can move on to the general case of the formula $N(IJ) = N(I)\,N(J)$.


Proof. (End of the proof of Proposition 3-4.14) It is enough, by the theorem
on the decomposition of ideals stated above (Theorem 3-4.18), to show that
the formula $N(IJ) = N(I)\,N(J)$ holds when $J$ is a non-zero prime ideal, in
other words, for $J$ maximal. Since $IJ \subset I$, we know that

$$\operatorname{card}(\mathcal{O}_K/IJ) = \operatorname{card}(\mathcal{O}_K/I)\operatorname{card}(I/IJ).$$

Since $J$ is maximal, $k := \mathcal{O}_K/J$ is a (finite) field. Since $I/IJ$ is also
an $\mathcal{O}_K$-module killed by $J$, we can consider it as a $k$-module or $k$-vector
space. If we show that it has dimension 1, then we have proven that
$\operatorname{card}(I/IJ) = \operatorname{card}(\mathcal{O}_K/J)$ which completes the proof. Now, a $k$-vector
subspace $\{0\} \subset L \subset I/IJ$ is also an $\mathcal{O}_K$-module, and therefore corresponds
to an ideal $I'$ such that $L = I'/IJ$ where $IJ \subset I' \subset I$. This gives us
$I' = IJ'$ where $J \subset J' \subset \mathcal{O}_K$; since $J$ is maximal, we must have that
$J' = J$ or $\mathcal{O}_K$, and hence $I' = IJ$ or $I' = I$. Thus we can conclude that
$L = \{0\}$ or $L = I/IJ$.


4.19. Remark. The statement of the theorem (but not the proof) is also
true for a general Dedekind ring (see for example P. Samuel’s book Théorie
algébrique des nombres [7] (Chap. 3)). Another important property, which
also serves as a definition for a Dedekind ring, is that fractional ideals are
invertible. In fact, the key idea in the proof given in [7] is to show that
if $I$ is a maximal ideal in $\mathcal{O}_K$ and if we set $I^* := \{x \in K \mid xI \subset \mathcal{O}_K\}$,
then $II^* = \mathcal{O}_K$. In particular, if we quotient the unitary monoid of ideals
by the submonoid of principal ideals, we obtain a group, explicitly described
below on page 104 (a fractional ideal $I$ is an $\mathcal{O}_K$-submodule of $K$ such that
$dI \subset \mathcal{O}_K$ for some non-zero $d \in \mathcal{O}_K$; it is invertible if there exists $I'$ such
that $II' = \mathcal{O}_K$).


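We are now going to give a more thorough description of prime ideals. Let
us first point out that if $\mathfrak{p}$ is a non-zero prime ideal in $\mathcal{O}_K$, then $\mathfrak{p} \cap \mathbf{Z}$ is
a non-zero prime ideal in $\mathbf{Z}$, hence of the form $p\mathbf{Z}$ for some prime number
$p$. Thus every prime ideal $\mathfrak{p}$ can be associated to a prime number $p$, which is also the
characteristic of the residue field $\mathcal{O}_K/\mathfrak{p}$. Conversely, if $p$ is a prime number,
then there is no reason that the ideal that it generates in $\mathcal{O}_K$ should still
be prime; in light of the theorem above, it is written

$$p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_s^{e_s} \quad\text{where the } \mathfrak{p}_i \text{ are distinct prime ideals and } e_i \geq 1.$$

Let $f_i = [\mathcal{O}_K/\mathfrak{p}_i : \mathbf{F}_p]$, so that $N\mathfrak{p}_i = p^{f_i}$. By taking the norms, we obtain

$$N(p\mathcal{O}_K) = p^n = N\mathfrak{p}_1^{e_1} \cdots N\mathfrak{p}_s^{e_s} = p^{e_1f_1 + \dots + e_sf_s},$$

from which we have the relation

$$\sum_{i=1}^{s} e_if_i = n = [K : \mathbf{Q}].$$

By using the Chinese remainder theorem, we also have that

$$\mathcal{O}_K/p\mathcal{O}_K \cong (\mathcal{O}_K/\mathfrak{p}_1^{e_1}) \times \dots \times (\mathcal{O}_K/\mathfrak{p}_s^{e_s}).$$

Describing the prime ideals in $\mathcal{O}_K$ thus boils down to describing the
decomposition in $\mathcal{O}_K$ of primes in $\mathbf{Z}$.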


4.20. Example. (Decomposition of primes in a quadratic field.) In the
case where $K = \mathbf{Q}(\sqrt{d})$ and $[K : \mathbf{Q}] = 2$ (we can assume that $d$ is
square-free), we have three possibilities for the decomposition of $p$.


i) We can have $p\mathcal{O}_K = \mathfrak{p}_1\mathfrak{p}_2$ where $N\mathfrak{p}_i = p$; we say then that $p$ is split in $K$.
ii) We can have $p\mathcal{O}_K = \mathfrak{p}_1$ where $N\mathfrak{p}_1 = p^2$; we say then that $p$ is
inert in $K$.
iii) We can have $p\mathcal{O}_K = \mathfrak{p}_1^2$ where $N\mathfrak{p}_1 = p$; we say then that $p$ is ramified
in $K$.

These cases correspond respectively to $s = 2$, $e_1 = e_2 = 1$ and $f_1 = f_2 = 1$;
$s = 1$, $e_1 = 1$ and $f_1 = 2$; and $s = 1$, $e_1 = 2$ and $f_1 = 1$. We can also
characterize them using the Legendre symbol.


4.21. Theorem. Let $K = \mathbf{Q}(\sqrt{d})$ be a quadratic field, where $d$ is square-free.
If $p$ is an odd prime number, then
i) $p$ is split in $K$ if and only if $\left(\frac{d}{p}\right) = +1$;
ii) $p$ is inert in $K$ if and only if $\left(\frac{d}{p}\right) = -1$;
iii) $p$ is ramified in $K$ if and only if $\left(\frac{d}{p}\right) = 0$, in other words if $p$ divides $d$.


For the prime 2 the decomposition law is given by

i) 2 is split in $K$ if and only if $d \equiv 1 \bmod 8$;
ii) 2 is inert in $K$ if and only if $d \equiv 5 \bmod 8$;
iii) 2 is ramified in $K$ if and only if $d \equiv 2$ or $3 \bmod 4$.


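Proof. If $p$ is an odd prime, we have

$$\mathcal{O}_K/p\mathcal{O}_K \cong \mathbf{Z}[\sqrt{d}]/p\mathbf{Z}[\sqrt{d}].$$

This is a trivial remark if $d \equiv 2$ or $3 \bmod 4$, and for the case $d \equiv 1 \bmod 4$,
it suffices to notice that if $b$ is an odd integer,

$$a + b\left(\frac{1+\sqrt{d}}{2}\right) = a + \left(\frac{b-p}{2}\right)(1 + \sqrt{d}) + p\left(\frac{1+\sqrt{d}}{2}\right),$$

hence $\mathcal{O}_K = \mathbf{Z}[\sqrt{d}] + p\mathcal{O}_K$. Next, we have the isomorphisms

$$A := \mathcal{O}_K/p\mathcal{O}_K \cong \mathbf{Z}[\sqrt{d}]/p\mathbf{Z}[\sqrt{d}] \cong \mathbf{Z}[X]/(p, X^2 - d)\mathbf{Z}[X] \cong \mathbf{F}_p[X]/(X^2 - d)\mathbf{F}_p[X].$$

Therefore, we have the following three cases. Either $X^2 - d$ can be factored
in $\mathbf{F}_p[X]$ into two distinct factors, which corresponds to $\left(\frac{d}{p}\right) = +1$, hence
$A \cong \mathbf{F}_p \times \mathbf{F}_p$, and $p$ is split; or $X^2 - d$ is irreducible in $\mathbf{F}_p[X]$, which
corresponds to $\left(\frac{d}{p}\right) = -1$, hence $A \cong \mathbf{F}_{p^2}$, and $p$ is inert; or finally
$X^2 - d$ has a double root in $\mathbf{F}_p$, which corresponds to $d = 0$ in $\mathbf{F}_p$ and
$\left(\frac{d}{p}\right) = 0$, hence $A \cong \mathbf{F}_p[X]/X^2\mathbf{F}_p[X]$, and $p$ is ramified.


If $p = 2$ and $d \equiv 2$ or $3 \bmod 4$, then we have

$$\mathcal{O}_K/2\mathcal{O}_K \cong \mathbf{Z}[\sqrt{d}]/2\mathbf{Z}[\sqrt{d}] \cong \mathbf{Z}[X]/(2, X^2 - d)\mathbf{Z}[X]
\cong \mathbf{F}_2[X]/(X^2 - d)\mathbf{F}_2[X] \cong \mathbf{F}_2[X]/(X - d)^2\mathbf{F}_2[X],$$

and hence 2 is ramified. Now, if $d \equiv 1 \bmod 4$, since the minimal polynomial
of $\frac{1+\sqrt{d}}{2}$ is $X^2 - X - \frac{d-1}{4}$, we have

$$\mathcal{O}_K/2\mathcal{O}_K \cong \mathbf{Z}\left[\tfrac{1+\sqrt{d}}{2}\right]/2\mathbf{Z}\left[\tfrac{1+\sqrt{d}}{2}\right]
\cong \mathbf{Z}[X]/\left(2, X^2 - X - \tfrac{d-1}{4}\right)\mathbf{Z}[X]
\cong \mathbf{F}_2[X]/\left(X^2 - X - \tfrac{d-1}{4}\right)\mathbf{F}_2[X].$$

Thus if $\frac{d-1}{4} \equiv 0 \bmod 2$, in other words $d \equiv 1 \bmod 8$, then
$X^2 - X - \frac{d-1}{4} = X(X - 1)$ in $\mathbf{F}_2[X]$, hence $\mathcal{O}_K/2\mathcal{O}_K \cong \mathbf{F}_2 \times \mathbf{F}_2$, and 2 is split. But
if $\frac{d-1}{4} \equiv 1 \bmod 2$, in other words $d \equiv 5 \bmod 8$, since
$X^2 - X - \frac{d-1}{4} = X^2 + X + 1$ is irreducible in $\mathbf{F}_2[X]$, then $\mathcal{O}_K/2\mathcal{O}_K \cong \mathbf{F}_4$, and 2 is inert.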
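
The proof reduces the decomposition of $p$ to the factorization of a quadratic polynomial over $\mathbf{F}_p$, which suggests the following sketch. It is an illustration under stated assumptions, with hypothetical function names: $d$ is assumed square-free and different from 1, $p$ is assumed prime, and the decomposition type is read off from the number of roots in $\mathbf{F}_p$ of $X^2 - d$ (or of $X^2 - X - \frac{d-1}{4}$ when $p = 2$ and $d \equiv 1 \bmod 4$). The Legendre symbol is computed by Euler's criterion, which is not recalled in this section.

```python
def splitting_type(d: int, p: int) -> str:
    """Decomposition of the prime p in Q(sqrt(d)), d square-free, d != 1."""
    if p == 2 and d % 4 == 1:
        c = (d - 1) // 4
        poly = lambda x: x * x - x - c     # minimal polynomial of (1+sqrt(d))/2
    else:
        poly = lambda x: x * x - d         # minimal polynomial of sqrt(d)
    roots = sum(1 for x in range(p) if poly(x) % p == 0)
    # two distinct roots: split; no root: inert; one (double) root: ramified
    return {2: "split", 0: "inert", 1: "ramified"}[roots]


def legendre(d: int, p: int) -> int:
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    t = pow(d % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


# Gaussian integers, d = -1: 5 = (2+i)(2-i) splits, 3 stays prime, 2 ramifies.
examples = [splitting_type(-1, 5), splitting_type(-1, 3), splitting_type(-1, 2)]
```

Counting roots by brute force is only reasonable for small $p$; for odd $p$ the function `legendre` gives the same information in $O(\log p)$ multiplications.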


4.22. Remark. We will now consider an odd prime number $p$ and $K = \mathbf{Q}(\sqrt{d})$.
We can see that there exists a prime ideal $\mathfrak{p}$ in $\mathcal{O}_K$ such that $N\mathfrak{p} = p$
if and only if $p$ is ramified or split, in other words if and only if the Legendre
symbol $\left(\frac{d}{p}\right)$ equals 0 or 1. Thus we recover the congruence conditions for
the solvability of the equation $x^2 - dy^2 = p$ established in Example 1-3.6.
We then know that this equation, or the equation

$$N^K_{\mathbf{Q}}\left(x + y\,\frac{1+\sqrt{d}}{2}\right) = x^2 + xy - \frac{d-1}{4}\,y^2 = p$$

where $d \equiv 1 \bmod 4$, has a solution if and only if the
congruence conditions are satisfied and the associated ideal $\mathfrak{p}$ is principal.
From this, we can deduce another proof of the two-square theorem.


We have shown that every ideal is invertible modulo the equivalence relation given in 3-4.15 and can therefore talk about the ideal class group, denoted $\mathrm{C}\ell_K$ or $\mathrm{Pic}(\mathcal{O}_K)$. Thus the ring $\mathcal{O}_K$ is principal if and only if the group $\mathrm{C}\ell_K$ is reduced to one element. We have seen that the rings $\mathcal{O}_K$ are in general not principal, but we can nevertheless say that they are "almost principal" by the finiteness theorem, which we will now state.


4.23. Theorem. The class group of ideals $\mathrm{C}\ell_K$ of a number field $K$ is finite.


4.24. Corollary. Let $h_K := \mathrm{card}(\mathrm{C}\ell_K)$, which is called the class number of $K$. For every non-zero ideal $I$ in the ring $\mathcal{O}_K$, the ideal $I^{h_K}$ is principal. Conversely, if $\gcd(h_K, m) = 1$ and $I^m$ is principal, then $I$ is principal.

4.25. Remark. The corollary above is essentially due to Kummer, who used the following variation of it: if $K = \mathbf{Q}(\exp(2i\pi/p))$ has the property that $p$ does not divide the class number $h_K$ (we say then that $p$ is regular), then an ideal whose $p$th power is principal is itself principal. This property, together with another one concerning units (again in the regular case), allowed Kummer to prove "Fermat's last theorem" for all regular prime exponents. The smallest non-regular prime exponent is 37. Kummer's proof follows the outline of the proof of Theorem 3-2.6; the first case is handled with the aid of congruences modulo $\lambda^2$ and the second case with the aid of a descent where Kummer's lemma on the units of a cyclotomic field plays a crucial role (for the details, see the book by Borevich-Shafarevich [2], or also [58] or [77]).


In the following section, we will go over the main points of the proof of Theorem 3-4.23, as well as the structure of the group of units of $\mathcal{O}_K$. These last two properties (finiteness of the class group and finite generation of the group of units) are not purely algebraic (they are moreover false for Dedekind rings in general), and the proof of these properties will rely on the geometry of numbers.


5. Geometry of Numbers 


We will start with the following statement from topology, which generalizes 
Lemma 3-3.5. 


5.1. Proposition. A discrete subgroup $G$ in $\mathbf{R}^n$ has a basis over $\mathbf{Z}$ formed of $r$ linearly independent vectors over $\mathbf{R}$ (where $r \leq n$); in particular, $G \cong \mathbf{Z}^r$.


Proof. Let $e_1, \dots, e_r$ be a maximal system of vectors of $G$ which are linearly independent over $\mathbf{R}$; it suffices to prove that $\mathbf{Z}e_1 + \cdots + \mathbf{Z}e_r$ is a subgroup of finite index in $G$. The intersection of $G$ and the compact set $K_0 := \{x_1e_1 + \cdots + x_re_r \mid x_i \in [0,1]\}$ is a finite set. Now let $x \in G$. It can be naturally written as $x = x_1e_1 + \cdots + x_re_r$, where the $x_i$ are a priori in $\mathbf{R}$. The vectors $y^{(m)} := (mx_1 - \lfloor mx_1 \rfloor)e_1 + \cdots + (mx_r - \lfloor mx_r \rfloor)e_r$ are all in $G \cap K_0$. If we let $m$ vary from 0 to $M := \mathrm{card}(G \cap K_0)$, two of them will of course be equal, say $y^{(m_1)} = y^{(m_2)}$. This gives us $x_i = (\lfloor m_1x_i \rfloor - \lfloor m_2x_i \rfloor)/(m_1 - m_2)$, and hence, by letting $d := M!$, we have

$$\mathbf{Z}e_1 + \cdots + \mathbf{Z}e_r \subset G \subset \frac{1}{d}\left(\mathbf{Z}e_1 + \cdots + \mathbf{Z}e_r\right),$$

which finishes the proof.




5.2. Definition. Whenever $r = n$, we say that $G$ is a lattice in $\mathbf{R}^n$; this boils down to requiring that $G$ be discrete and that $\mathbf{R}^n/G$ be compact. We then define the volume or determinant of a lattice $G$ to be the absolute value of the determinant of a basis of $G$ (with respect to the canonical basis of $\mathbf{R}^n$).


5.3. Theorem. (Minkowski) Let $K \subset \mathbf{R}^n$ be a compact, convex, and symmetric (i.e., $x \in K$ implies that $-x \in K$) set. Assume that $\mathrm{vol}(K) \geq 2^n$. Then there exists a non-zero $x$ in $K \cap \mathbf{Z}^n$.


More generally, if $\Lambda$ is a lattice and $\mathrm{vol}(K) \geq 2^n \det(\Lambda)$, then there exists a non-zero $x$ in $K \cap \Lambda$.


Remark. The statement is optimal because, for example, the open cube defined by $\max_i |x_i| < 1$ is convex and symmetric and has volume equal to $2^n$, while the compact cube defined by $\max_i |x_i| \leq 1 - \epsilon$ is convex and symmetric and has volume equal to $2^n(1-\epsilon)^n$; neither of them contains a non-zero point of $\mathbf{Z}^n$.


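Proof. The second statement follows from the first by making a linear variable change which takes the lattice $\Lambda$ to $\mathbf{Z}^n$.

We set $C := [0,1[^n$. Let $T \subset \mathbf{R}^n$, and suppose that $(T + \lambda) \cap (T + \mu) = \emptyset$ for $\lambda \neq \mu \in \mathbf{Z}^n$. This gives us $T = \bigcup_{\lambda \in \mathbf{Z}^n} (T \cap (C + \lambda))$, hence

$$\mathrm{vol}(T) = \sum_{\lambda \in \mathbf{Z}^n} \mathrm{vol}\left(T \cap (C + \lambda)\right) = \sum_{\lambda \in \mathbf{Z}^n} \mathrm{vol}\left((T - \lambda) \cap C\right) = \mathrm{vol}\Big(\big(\bigcup_{\lambda \in \mathbf{Z}^n}(T - \lambda)\big) \cap C\Big) \leq \mathrm{vol}(C) = 1.$$

By contraposition, if $\mathrm{vol}(T) > 1$, then there exists $x \in T \cap (T + \lambda)$ with $0 \neq \lambda \in \mathbf{Z}^n$, so that such a $\lambda$ lies in $T - T$. We now return to the proof by letting $T := \frac{1}{2}K = \{\frac{x}{2} \mid x \in K\}$. Then we have $K = T - T$ and $\mathrm{vol}(T) = 2^{-n}\mathrm{vol}(K)$. If $\mathrm{vol}(K) > 2^n$, then $K \cap (\mathbf{Z}^n \setminus \{0\})$ is nonempty.

If $\mathrm{vol}(K) = 2^n$ and $K$ is compact, we get the same result. For every $m > 0$, the set $K_m = (1 + 1/m)K$ contains a non-zero element $x_m$ in the lattice $\mathbf{Z}^n$. The sequence $(x_m)$, with values in the intersection of the compact set $K_1$ and the lattice $\mathbf{Z}^n$, contains an eventually constant subsequence, whose limit is $x \in \mathbf{Z}^n \setminus \{0\}$. Furthermore, the point $x$ is in $\bigcap_{m > 0} K_m$, which coincides with the compact set $K$.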


Applications. We can use Minkowski’s theorem above to give other proofs 
of the two-square and four-square theorems. 


A prime number $p \equiv 1 \bmod 4$ is the sum of two squares.

Proof. Let $a \in \mathbf{Z}$ be such that $a^2 + 1 \equiv 0 \bmod p$, and define the lattice
$$\Lambda := \{(x,y) \in \mathbf{Z}^2 \mid y \equiv ax \bmod p\}.$$

Then $\det(\Lambda) = p$ and $\mathrm{vol}(B(0,r)) = \pi r^2$, hence whenever $\pi r^2 \geq 4p$, there exists a non-zero vector in $B(0,r) \cap \Lambda$. We can choose $r := \sqrt{4p/\pi}$. Then there exists a non-zero $(x,y) \in \Lambda$ such that

$$0 < x^2 + y^2 \leq r^2 = 4p/\pi < 2p.$$

So we have that $x^2 + y^2 \equiv (1 + a^2)x^2 \equiv 0 \bmod p$, and hence $x^2 + y^2 = p$.
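The lattice used in this proof also lends itself to computation. The sketch below is an illustration under stated assumptions: it replaces Minkowski's existence argument by the classical Lagrange-Gauss reduction of a two-dimensional lattice, which returns a shortest non-zero vector, and it assumes that the input is a prime congruent to 1 modulo 4. The names `sqrt_minus_one` and `two_squares` are hypothetical.

```python
def sqrt_minus_one(p):
    """Return a with a*a = -1 mod p, assuming p is a prime with p = 1 mod 4."""
    for g in range(2, p):
        a = pow(g, (p - 1) // 4, p)
        if (a * a + 1) % p == 0:   # happens as soon as g is a non-residue
            return a
    raise ValueError("p must be a prime congruent to 1 mod 4")


def two_squares(p):
    """Shortest vector of the lattice {(x, y) : y = a*x mod p}."""
    a = sqrt_minus_one(p)
    v1, v2 = (1, a), (0, p)        # a Z-basis of the lattice, of determinant p

    def norm(w):
        return w[0] * w[0] + w[1] * w[1]

    if norm(v1) > norm(v2):
        v1, v2 = v2, v1
    while True:
        n1 = norm(v1)
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        m = (2 * dot + n1) // (2 * n1)      # nearest integer to dot / n1
        v2 = (v2[0] - m * v1[0], v2[1] - m * v1[1])
        if norm(v2) >= n1:                  # v1 is now a shortest vector
            return abs(v1[0]), abs(v1[1])
        v1, v2 = v2, v1
```

A shortest vector $(x, y)$ satisfies $0 < x^2 + y^2 < 2p$ and $x^2 + y^2 \equiv 0 \bmod p$, so the returned pair represents $p$; for example `two_squares(13)` yields the pair `(2, 3)`.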


Every positive integer is the sum of four squares.


Proof. Let $n = p_1 \cdots p_r$ be square-free; it suffices to show that such an $n$ is the sum of four squares. As in the first proof of Lagrange's theorem, choose $a_i$ and $b_i$ such that

$$a_i^2 + b_i^2 + 1 \equiv 0 \bmod p_i.$$

Consider the lattice given by
$$\Lambda := \{x \in \mathbf{Z}^4 \mid x_3 \equiv a_ix_1 + b_ix_2 \bmod p_i \text{ and } x_4 \equiv b_ix_1 - a_ix_2 \bmod p_i,\ 1 \leq i \leq r\}.$$

The volume of $\Lambda$ is $\leq (p_1 \cdots p_r)^2 = n^2$. We can choose $\rho$ such that $\mathrm{vol}(B(0,\rho)) = \frac{\pi^2}{2}\rho^4 = 2^4\det(\Lambda)$, and then by Minkowski's theorem, we have $0 \neq x \in \Lambda$ such that $0 < x_1^2 + x_2^2 + x_3^2 + x_4^2 \leq \rho^2 < 2n$. However,

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv x_1^2 + x_2^2 + (a_ix_1 + b_ix_2)^2 + (b_ix_1 - a_ix_2)^2 \equiv 0 \bmod p_i.$$

Thus $n$ divides $x_1^2 + x_2^2 + x_3^2 + x_4^2$ and hence $x_1^2 + x_2^2 + x_3^2 + x_4^2 = n$.


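5.4. Remark. We could also prove the four-square theorem by proving Jacobi's formula (see Exercise 3-6.11). If we denote by $r_4(n) := \mathrm{card}\{(x,y,z,t) \in \mathbf{Z}^4 \mid x^2 + y^2 + z^2 + t^2 = n\}$, then

$$r_4(n) = 8\sum_{d \mid n,\ 4 \nmid d} d = \begin{cases} 8\displaystyle\sum_{d \mid n} d & \text{if } n \text{ is odd},\\[2mm] 24\displaystyle\sum_{d \mid n,\ d \text{ odd}} d & \text{if } n \text{ is even}, \end{cases} \qquad (3.5)$$

where $n > 0$. The four-square theorem follows, since the right-hand side of the equality is clearly positive. This formula can be written in terms of generating functions, where we denote by $r_k(n)$ the number of ways to write $n$ as the sum of $k$ squares, i.e., $r_k(n) := \mathrm{card}\{(x_1, \dots, x_k) \in \mathbf{Z}^k \mid x_1^2 + \cdots + x_k^2 = n\}$.
We will define the following series (formal or convergent if $|q| < 1$):

$$\Theta(q) := \sum_{n \in \mathbf{Z}} q^{n^2}, \quad\text{so that}\quad \Theta(q)^k = \sum_{n \geq 0} r_k(n)\,q^n,$$

and

$$Z(q) := \sum_{n=1}^{\infty} \frac{nq^n}{1-q^n} = \sum_{n \in \mathbf{N}^*} \sigma(n)\,q^n, \quad\text{where } \sigma(n) = \sum_{d \mid n} d.$$

Then Jacobi's formula can be written as

$$\Theta(q)^4 = 1 + 8\left(Z(q) - 4Z(q^4)\right).$$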
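Jacobi's formula is easy to test numerically for small $n$. The following sketch compares a brute-force count of the representations with the divisor sum over the divisors not divisible by 4; the function names are illustrative, and the enumeration is only practical for small values of $n$.

```python
from itertools import product
from math import isqrt


def r4_bruteforce(n):
    """Number of (x, y, z, t) in Z^4 with x^2 + y^2 + z^2 + t^2 = n."""
    b = isqrt(n)
    rng = range(-b, b + 1)
    return sum(1 for v in product(rng, repeat=4)
               if v[0] ** 2 + v[1] ** 2 + v[2] ** 2 + v[3] ** 2 == n)


def r4_jacobi(n):
    """8 times the sum of the divisors d of n such that 4 does not divide d."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)
```

For instance both functions return 24 for $n = 4$: the 8 permutations of $(\pm 2, 0, 0, 0)$ and the 16 sign choices in $(\pm 1, \pm 1, \pm 1, \pm 1)$.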


The following theorem of Hermite can also be proven using Minkowski's theorem (although Hermite's method provides a better constant $\gamma_n$).


5.5. Theorem. There exists a constant $\gamma_n$ such that if $Q : \mathbf{R}^n \to \mathbf{R}$ is a positive-definite quadratic form, then

$$m(Q) := \min_{x \in \mathbf{Z}^n \setminus \{0\}} Q(x) \leq \gamma_n (\det Q)^{1/n}.$$

Proof. Consider the ellipsoid $B_Q(r) := \{x \in \mathbf{R}^n \mid Q(x) \leq r^2\}$; its volume is $v_nr^n/\sqrt{\det(Q)}$, where $v_n$ is the volume of the unit ball for the usual Euclidean norm. We can choose $r$ in such a way that this volume is equal to $2^n$. Therefore, there exists $x \in \mathbf{Z}^n \setminus \{0\}$ in $B_Q(r)$ which satisfies

$$Q(x) \leq r^2 = \frac{4}{v_n^{2/n}}\,(\det Q)^{1/n}.$$

Let us now proceed to some applications to general number field theory. 


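If $K = \mathbf{Q}(\alpha)$ and $P$ is the minimal polynomial of $\alpha$, then $n = [K : \mathbf{Q}] = \deg(P)$, and $P$ has $r_1$ real roots $\alpha_1, \dots, \alpha_{r_1}$ and $r_2$ pairs of complex roots $\alpha_{r_1+1}, \bar{\alpha}_{r_1+1}, \dots, \alpha_{r_1+r_2}, \bar{\alpha}_{r_1+r_2}$ (so $n = r_1 + 2r_2$). The embeddings from $K$ into $\mathbf{R}$ are therefore given by $\sigma_i(\alpha) = \alpha_i$ (for $1 \leq i \leq r_1$) and the complex (nonreal) embeddings by $\sigma_{r_1+i}(\alpha) = \alpha_{r_1+i}$, $\bar{\sigma}_{r_1+i}(\alpha) = \bar{\alpha}_{r_1+i}$ (for $1 \leq i \leq r_2$).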


5.6. Theorem. (Dirichlet's unit theorem) Let $K$ be a number field with $r_1$ real embeddings and $r_2$ pairs of complex conjugate embeddings. The group of units $\mathcal{O}_K^*$ is a finitely generated abelian group, which is isomorphic to the direct product of the finite group of roots of unity of $K$ and the free group $\mathbf{Z}^r$, where $r := r_1 + r_2 - 1$.


Proof. We are going to prove this theorem for an integer $r$, where $r \leq r_1 + r_2 - 1$ (see the remarks below and Exercise 3-6.23 for the complete proof). We will proceed as we did when we looked at the units of $\mathbf{Q}(\sqrt{d})$. To do this, we introduce the homomorphism $L : \mathcal{O}_K^* \to \mathbf{R}^{r_1+r_2}$ defined by

$$L(\alpha) = \left(\log|\sigma_1(\alpha)|, \dots, \log|\sigma_{r_1}(\alpha)|, 2\log|\sigma_{r_1+1}(\alpha)|, \dots, 2\log|\sigma_{r_1+r_2}(\alpha)|\right),$$

and we will show that there are only a finite number of elements $\alpha \in \mathcal{O}_K^*$ such that $L(\alpha)$ is contained in a given ball in $\mathbf{R}^{r_1+r_2}$, so that the kernel turns out to be finite, consisting of roots of unity contained in $K$, and the image $L(\mathcal{O}_K^*)$ turns out to be discrete. We can observe that the image is contained in the hyperplane $x_1 + \cdots + x_{r_1+r_2} = 0$ because $\log|\mathrm{N}^K_{\mathbf{Q}}(\alpha)| = 0$. Finally, since a discrete subgroup of $\mathbf{R}^m$ is isomorphic to $\mathbf{Z}^r$, where $r \leq m$, we have the statement of the theorem with $r \leq r_1 + r_2 - 1$.


5.7. Examples. An imaginary quadratic field satisfies $r_1 = 0$ and $r_2 = 1$, hence $r = 0$, i.e., $\mathcal{O}_K^*$ is finite. A real quadratic field satisfies $r_1 = 2$ and $r_2 = 0$, hence $r \leq 1$; actually $r = 1$ and $\mathcal{O}_K^* \cong \{\pm 1\} \times \mathbf{Z}$ as we have seen in Theorem 3-3.3. In the case $K = \mathbf{Q}(\sqrt[3]{2})$, we have $r_1 = r_2 = 1$ hence $r \leq 1$, and so by setting $\alpha := \sqrt[3]{2}$ to lighten the notation, we see that $(1 + \alpha + \alpha^2)(\alpha - 1) = 1$, hence $1 + \alpha + \alpha^2$ is a unit and $r = 1$. In the case of $K = \mathbf{Q}(\zeta)$ where $\zeta := \exp(2\pi i/p)$, we see that $r_1 = 0$, $r_2 = (p-1)/2$, hence $r \leq (p-3)/2$. It can be shown directly that $r = (p-3)/2$ by checking that the units $\eta_k := \sin(k\pi/p)/\sin(\pi/p)$, for $k = 2, \dots, (p-1)/2$, are independent and therefore generate a subgroup of rank $(p-3)/2$, hence of finite index in $\mathcal{O}_K^*$.
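The cubic example can be checked by direct computation. The sketch below is only an illustration: it represents $c_0 + c_1\alpha + c_2\alpha^2$ with $\alpha^3 = 2$ as a triple of integers, and evaluates the logarithmic map of the proof of Theorem 5.6 in floating point. The helper names `mul` and `log_embedding` are assumptions made for this example.

```python
import cmath
import math


def mul(u, v):
    """Product in Z[a] with a^3 = 2; (c0, c1, c2) stands for c0 + c1*a + c2*a^2."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    return (a0 * b0 + 2 * (a1 * b2 + a2 * b1),
            a0 * b1 + a1 * b0 + 2 * a2 * b2,
            a0 * b2 + a1 * b1 + a2 * b0)


def log_embedding(u):
    """(log|s1(u)|, 2*log|s2(u)|) for the real and one complex embedding."""
    real_root = 2 ** (1 / 3)
    complex_root = real_root * cmath.exp(2j * math.pi / 3)

    def value(x):
        return u[0] + u[1] * x + u[2] * x * x

    return (math.log(abs(value(real_root))),
            2 * math.log(abs(value(complex_root))))
```

Here `mul((1, 1, 1), (-1, 1, 0))` gives `(1, 0, 0)`, confirming that $1 + \alpha + \alpha^2$ is a unit, and the two coordinates of `log_embedding((1, 1, 1))` add up to zero up to rounding, as expected for a point of the hyperplane $x_1 + x_2 = 0$.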


To prove the finiteness of the class group, we use the embedding of $K$ into $E := \mathbf{R}^{r_1} \times \mathbf{C}^{r_2} \cong \mathbf{R}^{[K:\mathbf{Q}]}$ given by
$$\Phi(\alpha) = \left(\sigma_1(\alpha), \dots, \sigma_{r_1}(\alpha), \sigma_{r_1+1}(\alpha), \dots, \sigma_{r_1+r_2}(\alpha)\right).$$

We can show, as we did previously, that the image of $\mathcal{O}_K$ is discrete. Since $\mathcal{O}_K \cong \mathbf{Z}^n$, the image is therefore a lattice with volume $V = V_K$.


Now, we will prove that if $r_1 + r_2 - 1 \geq 1$, then the group of units is infinite. We consider the convex set in $\mathbf{R}^n = \mathbf{R}^{r_1} \times \mathbf{C}^{r_2}$ given by

$$B(t_1, \dots, t_{r_1+r_2}) := \{x \in \mathbf{R}^{r_1} \times \mathbf{C}^{r_2} \mid |x_i| \leq t_i\},$$

whose volume is $2^{r_1}\pi^{r_2}\,t_1 \cdots t_{r_1}(t_{r_1+1} \cdots t_{r_1+r_2})^2$. We can choose the $t_i$ with volume equal to $2^nV_K$. Thus $t_1 \cdots t_{r_1}(t_{r_1+1} \cdots t_{r_1+r_2})^2$ equals a fixed constant, say $A_K$. Minkowski's theorem therefore guarantees the existence of a non-zero algebraic integer $\alpha$ such that $\Phi(\alpha) \in B(t_1, \dots, t_{r_1+r_2})$ and hence such that $|\mathrm{N}(\alpha)| \leq A_K$. By choosing, for example, $t_1$ smaller and smaller (and of course letting one of the other $t_i$ get bigger and bigger), we can even obtain infinitely many elements of norm smaller than $A_K$. Since there are only a finite number of ideals of norm $\leq A_K$, we obtain infinitely many elements which generate the same ideal and whose quotients are therefore units. This argument is refined in Exercise 3-6.23 to provide a complete proof of the theorem with $r_1 + r_2 - 1$ independent units.


Minkowski’s theorem also allows us to prove the following lemma. 


5.8. Lemma. There exists $c_1 > 0$ (which depends on $K$) such that if $I$ is a non-zero ideal in $\mathcal{O}_K$, then there exists a non-zero element $\alpha \in I$ such that $|\mathrm{N}^K_{\mathbf{Q}}(\alpha)| \leq c_1\,\mathrm{N}(I)$.


Proof. Let $K_t$ be the compact, convex, symmetric set in $E$ defined by $|x_i| \leq t$. Its volume is proportional to $t^n$ (where $n = [K : \mathbf{Q}]$), more precisely $\mathrm{vol}(K_t) = 2^{r_1}\pi^{r_2}t^n$. The lattice $\Phi(I)$ has volume $V_K\,\mathrm{N}(I)$. Therefore, if $2^{r_1}\pi^{r_2}t^n = 2^nV_K\,\mathrm{N}(I)$, we would have $\Phi(I) \cap K_t \neq \{0\}$. It follows that there exists a non-zero $\alpha \in I$ such that $\Phi(\alpha) \in K_t$, and hence $|\mathrm{N}^K_{\mathbf{Q}}(\alpha)| \leq t^n \leq c_1\,\mathrm{N}(I)$ where $c_1 := (4/\pi)^{r_2}V_K$.


The constant "$c_1$" (or sometimes the optimal value of this constant) is often called "Minkowski's constant".


5.9. Corollary. The set of ideal classes of $\mathcal{O}_K$ is finite.


Proof. Let $c_1$ be the constant given in the preceding lemma and set $m := \lfloor c_1 \rfloor!$. For a non-zero ideal $I$ in $\mathcal{O}_K$, choose $\alpha$ to be a non-zero element in $I$ with $|\mathrm{N}(\alpha)| \leq c_1\,\mathrm{N}(I)$. It follows that $(I : \alpha\mathcal{O}_K) \leq c_1$, and consequently $mI \subset \alpha\mathcal{O}_K$. Then we set $J := \frac{m}{\alpha}I$, which is an ideal in $\mathcal{O}_K$ in the same equivalence class as $I$ since $\alpha J = mI$. Furthermore, since $\alpha \in I$, we have $m\alpha \in \alpha J$, and hence $m \in J$. We already know that there exist a finite number of ideals in $\mathcal{O}_K$ which contain $m$ (they are in bijection with the ideals in $\mathcal{O}_K/m\mathcal{O}_K$).


By using the fact that every ideal (or every class of ideals) is invertible, as well as the multiplicativity of the norm of ideals, we can control the finite set of classes. We will use the constant $c_1$ from the previous lemma (3-5.8) to do this.


5.10. Corollary. Every ideal class in $\mathcal{O}_K$ contains an ideal with norm smaller than $c_1$.

Proof. Let $\mathcal{C}$ be an ideal class and $I$ an integral ideal in the inverse class. By the lemma, there exists $\alpha \in I$ such that $|\mathrm{N}^K_{\mathbf{Q}}(\alpha)| \leq c_1\,\mathrm{N}(I)$. We therefore have $\alpha\mathcal{O}_K \subset I$, and hence there exists an ideal $J$ such that $\alpha\mathcal{O}_K = IJ$. The ideal $J$ is an integral ideal which belongs to the class $\mathcal{C}$, and we have $\mathrm{N}(J) = |\mathrm{N}^K_{\mathbf{Q}}(\alpha)|\,\mathrm{N}(I)^{-1} \leq c_1$.


5.11. Remarks. We could give a more explicit expression for the constant $c_1$ which appears in Corollary 3-5.10. We need to first define the discriminant, denoted $\Delta_K$, of a number field $K$. To do this, we introduce a basis $\alpha_1, \dots, \alpha_n$ over $\mathbf{Z}$ of $\mathcal{O}_K$ as well as the set $\sigma_1, \dots, \sigma_n$ of embeddings of $K$ into $\mathbf{R}$ or $\mathbf{C}$. Then,

$$\Delta_K := \left(\det(\sigma_i(\alpha_j))\right)^2 \in \mathbf{Z}. \qquad (3.6)$$

An $\mathbf{R}$-linear variable change $(z, \bar{z}) \mapsto (\mathrm{Re}(z), \mathrm{Im}(z))$ over the complex coordinates shows that
$$V_K = 2^{-r_2}\sqrt{|\Delta_K|}. \qquad (3.7)$$

We can often compute the absolute value of this discriminant in the following manner (see Exercise 3-6.13): if $\alpha$ is an algebraic integer of $K$, whose minimal polynomial is $f(X) \in \mathbf{Z}[X]$, and if the index $u := (\mathcal{O}_K : \mathbf{Z}[\alpha])$ is finite, then

$$u^2|\Delta_K| = |\mathrm{N}^K_{\mathbf{Q}}(f'(\alpha))|. \qquad (3.8)$$
This gives us that every ideal class of $\mathcal{O}_K$ contains an ideal of norm $\leq (2/\pi)^{r_2}\sqrt{|\Delta_K|}$. By looking at the decomposition of small prime numbers, we can, at least if the discriminant is not too large, deduce what the structure of the class group $\mathrm{C}\ell_K$ is. We can of course improve the bounds (see Samuel [7], for example) and obtain the following value for "Minkowski's constant":
$$c_1 = \left(\frac{4}{\pi}\right)^{r_2}\frac{n!}{n^n}\sqrt{|\Delta_K|}. \qquad (3.9)$$
This improvement allows us to establish Hermite's inequality:
$$[K : \mathbf{Q}] \leq c\log|\Delta_K|, \quad\text{for } K \neq \mathbf{Q}. \qquad (3.10)$$

The bound that we have obtained lets us determine what the class group is in the following examples and in Exercises 3-6.16, 3-6.17 and 3-6.18.
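The two bounds of these remarks are easy to evaluate numerically. The following sketch computes the bound $(2/\pi)^{r_2}\sqrt{|\Delta_K|}$ coming from Lemma 5.8 and the refined constant (3.9); the function names are illustrative, and the discriminant is supplied by the caller rather than computed.

```python
from math import factorial, pi, sqrt


def crude_bound(r2, disc):
    """(2/pi)^r2 * sqrt(|disc|): the bound deduced from Lemma 5.8 and (3.7)."""
    return (2 / pi) ** r2 * sqrt(abs(disc))


def minkowski_bound(n, r2, disc):
    """(4/pi)^r2 * n!/n^n * sqrt(|disc|): the refined constant (3.9)."""
    return (4 / pi) ** r2 * factorial(n) / n ** n * sqrt(abs(disc))
```

For an imaginary quadratic field ($n = 2$, $r_2 = 1$) the two bounds coincide, while for a real quadratic field the refined constant is half the crude one; for example `crude_bound(0, 40)` is about 6.32 and `minkowski_bound(2, 0, 40)` is about 3.16.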


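5.12. Examples.


1. Take $K = \mathbf{Q}(i\sqrt{19})$, then $|\Delta_K| = 19$, and every ideal class contains an ideal with norm $\leq 2\sqrt{19}/\pi < 3$; however, we can check that 2 is prime in $\mathcal{O}_K = \mathbf{Z}\left[\frac{1+i\sqrt{19}}{2}\right]$ (cf. Theorem 3-4.21). The only ideal with norm $< 3$ is the unit ideal and $\mathbf{Z}\left[\frac{1+i\sqrt{19}}{2}\right]$ is thus principal.

2. Take $K = \mathbf{Q}(i\sqrt{23})$, then $|\Delta_K| = 23$, and every ideal class contains an ideal with norm $\leq 2\sqrt{23}/\pi < 4$. We can check that 2 and 3 are split in $\mathcal{O}_K = \mathbf{Z}\left[\frac{1+i\sqrt{23}}{2}\right]$, but no element has 2 or 3 as a norm. However, $\mathrm{N}\left(\frac{1+i\sqrt{23}}{2}\right) = 6$. Thus $2\mathcal{O}_K = \mathfrak{p}_1\mathfrak{p}_2$, $3\mathcal{O}_K = \mathfrak{p}_1'\mathfrak{p}_2'$ and $\left(\frac{1+i\sqrt{23}}{2}\right)\mathcal{O}_K = \mathfrak{p}_1\mathfrak{p}_1'$. The classes of $\mathfrak{p}_1$ and $\mathfrak{p}_2$ are distinct since the only elements of norm 4 are $\pm 2$. It follows that $\mathrm{C}\ell_K = \{1, [\mathfrak{p}_1], [\mathfrak{p}_2]\} \cong \mathbf{Z}/3\mathbf{Z}$. In particular, $\mathbf{Z}\left[\frac{1+i\sqrt{23}}{2}\right]$ is not principal.

3. Take $K = \mathbf{Q}(\sqrt{13})$, then $|\Delta_K| = 13$ and every ideal class contains an ideal with norm $\leq \sqrt{13} < 4$; however, we can check that 2 is prime in $\mathcal{O}_K = \mathbf{Z}\left[\frac{1+\sqrt{13}}{2}\right]$ (cf. Theorem 3-4.21) and next we check that $\mathrm{N}\left(\frac{1+\sqrt{13}}{2}\right) = -3$, hence $3\mathcal{O}_K = \mathfrak{p}_1\mathfrak{p}_2$ with $\mathfrak{p}_1, \mathfrak{p}_2$ principal ideals generated by $\frac{1+\sqrt{13}}{2}$ and $\frac{1-\sqrt{13}}{2}$. The ring $\mathbf{Z}\left[\frac{1+\sqrt{13}}{2}\right]$ is thus principal.

4. Take $K = \mathbf{Q}(\sqrt{10})$, then $|\Delta_K| = 40$ and every ideal class contains an ideal with norm $\leq \sqrt{40} < 7$; the primes 2 and 5 are ramified and 3 is split in $\mathcal{O}_K = \mathbf{Z}[\sqrt{10}]$ (cf. Theorem 3-4.21), thus $2\mathcal{O}_K = \mathfrak{p}^2$, $5\mathcal{O}_K = \mathfrak{q}^2$ and $3\mathcal{O}_K = \mathfrak{p}_1\mathfrak{p}_2$, and the class group is generated by these four prime ideals. We can check that no element has norm $\pm 2$ or $\pm 3$ (the equations $x^2 \pm 2 = 10y^2$ or $x^2 \pm 3 = 10y^2$ have no solution modulo 10), hence $\mathfrak{p}, \mathfrak{p}_1, \mathfrak{p}_2$ are not principal. Notice that $\sqrt{10}\,\mathcal{O}_K = \mathfrak{p}\mathfrak{q}$, hence we may omit $\mathfrak{q}$; next $\mathrm{N}(2 + \sqrt{10}) = -6$, thus $(2 + \sqrt{10})\mathcal{O}_K = \mathfrak{p}\mathfrak{p}_1$, thus the class group is generated by $\mathfrak{p}$, which is of order 2. It follows that $\mathrm{C}\ell_K = \{1, [\mathfrak{p}]\} \cong \mathbf{Z}/2\mathbf{Z}$ and the ring $\mathbf{Z}[\sqrt{10}]$ is not principal.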
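Some of the norm computations quoted in these examples can be confirmed by a finite search. The sketch below is only an illustration: `norm23` is the norm form $x^2 + xy + 6y^2$ of $x + y\frac{1+i\sqrt{23}}{2}$, and since this form is positive-definite, a search box of half-width 5 already contains every solution with value at most 6. The names `norm23`, `solutions` and `squares_mod_10` are hypothetical.

```python
def norm23(x, y):
    """Norm of x + y*(1 + i*sqrt(23))/2, that is x^2 + x*y + 6*y^2."""
    return x * x + x * y + 6 * y * y


def solutions(form, value, bound=5):
    """All integer pairs (x, y) in a box with form(x, y) == value."""
    rng = range(-bound, bound + 1)
    return sorted((x, y) for x in rng for y in rng if form(x, y) == value)


# Squares modulo 10, used for the equations x^2 -+ 2 = 10*y^2 and x^2 -+ 3 = 10*y^2.
squares_mod_10 = {x * x % 10 for x in range(10)}
```

The search finds no element of norm 2 or 3, exactly two elements of norm 4 (namely $\pm 2$), and the set `squares_mod_10` avoids the residues 2, 3, 7 and 8, which is the congruence obstruction used for $\mathbf{Q}(\sqrt{10})$.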


We will end this chapter by pointing out an often useful generalization of the ring of algebraic integers $\mathcal{O}_K$. For this, we make use of a finite set $S$ of prime ideals in $\mathcal{O}_K$.


5.13. Definition. Let $K$ be a number field and $S$ a finite set of prime ideals in $\mathcal{O}_K$. An element $\alpha \in K$ is called an $S$-algebraic integer if for every prime ideal $\mathfrak{p} \notin S$, $\mathrm{ord}_{\mathfrak{p}}(\alpha) \geq 0$. We denote by $\mathcal{O}_{K,S}$ the ring of $S$-integers and $\mathcal{O}_{K,S}^*$ the group of $S$-units.

5.14. Remark. We can prove, with a variation on the proof of Theorem 3-5.6, that the group $\mathcal{O}_{K,S}^*$ has rank $\leq r_1 + r_2 - 1 + |S|$ (and actually this is always an equality). Moreover, by using Theorem 3-4.23, we see that we can always choose $S$ such that $\mathcal{O}_{K,S}$ is principal. In fact, it suffices that the ideals whose classes are generators of $\mathrm{C}\ell_K$ can be written as a product of ideals in $S$.


6. Exercises 


6.1. Exercise. In this and the following two exercises, we denote by $A$ and $A_0$ the rings of quaternions defined on page 78. Prove that

$$\mathbf{Z}[i]^* = \{\pm 1, \pm i\}, \quad A_0^* = \{\pm 1, \pm I, \pm J, \pm K\} \quad\text{and}\quad A^* = A_0^* \cup \left\{\frac{\pm 1 \pm I \pm J \pm K}{2}\right\}.$$

(The group $A_0^*$ is the quaternion group of order 8. The group $A^*$, whose order is 24, is isomorphic to the group $\mathrm{SL}(2, \mathbf{F}_3)$; see Exercise 3-6.3 below.) Prove that $A_0$ is not (left) principal. Prove also that an element with norm equal to a prime number is irreducible.


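6.2. Exercise. If $B$ is a commutative ring, we define the ring $\mathbf{H}_B$ as the additive group

$$\mathbf{H}_B := \{x1 + yI + zJ + tK \mid x, y, z, t \in B\}$$

endowed with the $B$-bilinear multiplication which has the same multiplication table as $\mathbf{H}$. Notice that $A_0 = \mathbf{H}_{\mathbf{Z}}$ and $A \subset \mathbf{H}_{\mathbf{Z}[\frac{1}{2}]}$. (We could also consider $\mathbf{H}_B$ to be the "tensor product" $\mathbf{H}_B = A_0 \otimes_{\mathbf{Z}} B$.)


1) Let $F$ be a field of characteristic $\neq 2$ which contains two elements $a$ and $b$ such that $a^2 + b^2 + 1 = 0$. Prove that the map from $\mathbf{H}_F$ to the algebra of $2 \times 2$ matrices with coefficients in $F$ given by

$$1 \mapsto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I \mapsto \begin{pmatrix} a & b \\ b & -a \end{pmatrix}, \quad J \mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad K \mapsto \begin{pmatrix} -b & a \\ a & b \end{pmatrix}$$

is an isomorphism of $F$-algebras.


2) If $p$ is an odd prime, deduce from the previous question that $\mathbf{H}_{\mathbf{F}_p}$ is isomorphic to the algebra of $2 \times 2$ matrices with coefficients in $\mathbf{F}_p$.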
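The multiplication table of the matrices in question 1) can be verified mechanically over $\mathbf{F}_p$. The sketch below is an illustration, not a solution of the exercise: it only checks the relations $I^2 = J^2 = K^2 = -1$ and $IJ = -JI = K$, and does not address bijectivity. The helpers `find_ab`, `matmul` and `quaternion_matrices` are hypothetical names, and $p$ is assumed to be an odd prime.

```python
def find_ab(p):
    """Return (a, b) with a^2 + b^2 + 1 = 0 mod p, by exhaustive search."""
    for a in range(p):
        for b in range(p):
            if (a * a + b * b + 1) % p == 0:
                return a, b
    raise ValueError("no solution found; p is assumed to be an odd prime")


def matmul(X, Y, p):
    """Product of two 2x2 matrices with entries reduced modulo p."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))


def quaternion_matrices(p):
    """Images of 1, I, J, K under the map of question 1), over F_p."""
    a, b = find_ab(p)
    one = ((1, 0), (0, 1))
    I = ((a, b), (b, -a % p))
    J = ((0, 1), (p - 1, 0))
    K = (((-b) % p, a), (a, b))
    return one, I, J, K
```

For $p = 3$ the search returns $a = b = 1$, and one checks by hand that the matrix with rows $(1, 1)$ and $(1, 2)$ squares to $-1$ modulo 3.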


6.3. Exercise. We will use the same notation as in the previous exercise. Our goal is to show that the group $A^*$, formed of the invertible elements of the Hurwitz quaternion algebra, is isomorphic to $\mathrm{SL}(2, \mathbf{F}_3)$.

1) The homomorphism of reduction modulo 3 from $\mathbf{Z}[\frac{1}{2}]$ to $\mathbf{F}_3$ induces a ring homomorphism from $\mathbf{H}_{\mathbf{Z}[\frac{1}{2}]}$ to $\mathbf{H}_{\mathbf{F}_3}$. Deduce from this a group homomorphism $\phi : A^* \to \mathbf{H}_{\mathbf{F}_3}^* \cong \mathrm{GL}(2, \mathbf{F}_3)$.


2) Let $m \in A^*$ such that $m^2 = 1$ (resp. $m^3 = 1$). Prove that if $m \equiv 1 \bmod 3$, then $m = 1$. Conclude that $\mathrm{Ker}(\phi) = \{1\}$.


Hint. We can assume $m \neq 1$. Write $m$ as $m = 1 + 3^hx$ where $x \in A$, $h \geq 1$ and $x \not\equiv 0 \bmod 3$, which leads to a contradiction.


3) Conclude from this that $A^*$ is isomorphic to a subgroup of index 2 of $\mathrm{GL}(2, \mathbf{F}_3)$, so it must be equal to $\mathrm{SL}(2, \mathbf{F}_3)$.


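6.4. Exercise. Let $\alpha, \beta \in k^*$, and let $\sqrt{\alpha}$ be a root (in an extension of $k$) of $X^2 - \alpha = 0$. We set:

$$H = H_{\alpha,\beta} := \left\{ \begin{pmatrix} a + b\sqrt{\alpha} & \beta(c + d\sqrt{\alpha}) \\ c - d\sqrt{\alpha} & a - b\sqrt{\alpha} \end{pmatrix} \;\middle|\; a, b, c, d \in k \right\}.$$

i) Prove that $H$ is a subalgebra of $\mathrm{Mat}(2 \times 2, k(\sqrt{\alpha}))$ and that a basis is given by the identity $1 = I_2$ and the three matrices

$$I = \sqrt{\alpha}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad J = \begin{pmatrix} 0 & \beta \\ 1 & 0 \end{pmatrix}, \quad K = \sqrt{\alpha}\begin{pmatrix} 0 & \beta \\ -1 & 0 \end{pmatrix}.$$

Then check that the following multiplication identities hold: $I^2 = \alpha$, $J^2 = \beta$, $K^2 = -\alpha\beta$ and
$$IJ = -JI = K, \quad JK = -KJ = -\beta I, \quad KI = -IK = -\alpha J.$$

ii) Prove that an element $q = \begin{pmatrix} a + b\sqrt{\alpha} & \beta(c + d\sqrt{\alpha}) \\ c - d\sqrt{\alpha} & a - b\sqrt{\alpha} \end{pmatrix}$ of $H \setminus \{0\}$ is invertible if and only if
$$\det(q) = \mathrm{N}(a + b\sqrt{\alpha}) - \beta\,\mathrm{N}(c + d\sqrt{\alpha}) = a^2 - \alpha b^2 - \beta c^2 + \alpha\beta d^2 \neq 0.$$

(If $[k(\sqrt{\alpha}) : k] = 2$, the quantity $\mathrm{N}(a + b\sqrt{\alpha}) := a^2 - \alpha b^2$ is the norm of $a + b\sqrt{\alpha}$ for the extension $k(\sqrt{\alpha})/k$.) Deduce from this that $H$ is a division ring if and only if $[k(\sqrt{\alpha}) : k] = 2$ and $\beta$ is not a norm of $k(\sqrt{\alpha})/k$.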


6.5. Exercise. Modify the proof of Minkowski's theorem (Theorem 3-5.3) in order to obtain $\mathrm{card}(K \cap \Lambda) \geq 2^{-n}\mathrm{vol}(K)/\det(\Lambda)$.


6.6. Exercise. Prove that there exist constants $C_n$ (and explain what they are) such that if $L$ is a lattice in $\mathbf{R}^n$ endowed with the Euclidean norm, then there exists a basis, $e_1, \dots, e_n$, of $L$ which satisfies

$$\det(L) \leq \|e_1\| \cdots \|e_n\| \leq C_n\det(L).$$

Hint. The first inequality is true for every basis. For the second, look for a basis which is as close as possible to an orthogonal basis.


6.7. Exercise. Let $v_n$ be the volume of the unit ball in $\mathbf{R}^n$. For $s > 0$, we define the function $\Gamma(s) := \int_0^\infty t^{s-1}e^{-t}\,dt$.

1) Prove that $\Gamma(1) = 1$ and $\Gamma(1/2) = \sqrt{\pi}$. Prove that the values of the $\Gamma$ function at positive integers (resp. half-integers) can be computed with the formula $\Gamma(s+1) = s\Gamma(s)$.

2) By computing the integral $\int_{\mathbf{R}^n} \exp[-(x_1^2 + \cdots + x_n^2)]\,dx_1 \cdots dx_n$ in two different ways, prove that

$$v_n = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)},$$

and in particular that $v_{2m} = \pi^m/m!$ and $v_{2m+1} = a\pi^m$ where $a \in \mathbf{Q}^*$ (also specify $a$).
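The closed formula for $v_n$ can be sanity-checked numerically against the volumes used earlier in this chapter ($\pi r^2$ in dimension 2 and $\frac{\pi^2}{2}\rho^4$ in dimension 4). This is a minimal sketch relying on the standard library function `math.gamma`; the function name `unit_ball_volume` is an illustrative choice.

```python
from math import gamma, pi


def unit_ball_volume(n):
    """Volume of the Euclidean unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return pi ** (n / 2) / gamma(n / 2 + 1)
```

In particular the values in dimensions 1, 2, 3 and 4 are $2$, $\pi$, $4\pi/3$ and $\pi^2/2$, up to floating-point rounding.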


6.8. Exercise. Let $q$ be a positive-definite quadratic form with integer coefficients and such that for all $x \in \mathbf{Q}^n$ there exists $y \in \mathbf{Z}^n$ such that $q(x - y) < 1$. Let $m \in \mathbf{N}$; prove that there exists $x \in \mathbf{Z}^n$ such that $q(x) = m$ if and only if there exists $z \in \mathbf{Q}^n$ such that $q(z) = m$.

Hint. If $x \in \mathbf{Z}^n$ is such that $q(x) = t^2m$ and $t \geq 2$, choose $y \in \mathbf{Z}^n$ such that $q\left(\frac{x}{t} - y\right) < 1$, and then $x' := ax + by$ where $a = q(y) - m$ and $b = 2(mt - B(x,y))$. Then check that $q(x') = t'^2m$, where $t' = t\,q\left(\frac{x}{t} - y\right) < t$.


By using this property for $q(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$ and the Hasse-Minkowski theorem (Theorem 6-3.18 and Corollary 6-3.19), reprove the three-square theorem (Theorem 3-1.2).


Deduce from this the following theorem due to Gauss: Every number $m \in \mathbf{N}$ can be written as the sum of three triangular numbers (i.e., of the form $x(x-1)/2$).

Hint. Write $8m + 3$ as the sum of three squares $x_1^2 + x_2^2 + x_3^2$ and observe that the $x_i$ must be odd.


6.9. Exercise. We write the vectors of $\mathbf{R}^n$ as column vectors. The group of $n \times n$ square matrices with coefficients in the ring $A$ whose determinant lies in $A^*$ is denoted by $\mathrm{GL}_n(A)$ (in other words, the set of invertible matrices in the ring $\mathrm{Mat}(n \times n, A)$). If $A$ is a subring of $\mathbf{R}$, we denote by $\mathcal{S}_n(A)$ the set of symmetric, positive-definite matrices with coefficients in $A$. If moreover $Q[x] = {}^txQx$ is a quadratic form with integer coefficients, we say that it represents an integer $m$ if there exists $x \in \mathbf{Z}^n$ such that $Q[x] = m$. Finally, recall that two quadratic forms $Q$ and $Q'$ with integer coefficients are called equivalent if there exists $U \in \mathrm{GL}_n(\mathbf{Z})$ such that $Q' = Q[U] := {}^tUQU$.


a) Prove that two equivalent forms represent the same set of integers.


b) Let $x \in \mathbf{Z}^n$ such that $\gcd(x_1, \dots, x_n) = 1$. Prove that there exists a matrix $U \in \mathrm{GL}_n(\mathbf{Z})$ whose first column is $x$.


c) Let $Q$ be a matrix in $\mathcal{S}_n(\mathbf{R})$, and let

$$m(Q) = \min_{x \in \mathbf{Z}^n \setminus \{0\}} Q[x].$$

Let $x \in \mathbf{Z}^n \setminus \{0\}$ be a minimal vector in the definition of $m(Q)$. Prove that a matrix $U \in \mathrm{GL}_n(\mathbf{Z})$ can be constructed so that $Q' = Q[U]$ satisfies

$$Q'[e_1] = m(Q') = m(Q).$$


d) Let $Q'$ be a matrix in $\mathcal{S}_n(\mathbf{R})$ such that $Q'[e_1] = m(Q')$. Prove that a matrix $U \in \mathrm{GL}_n(\mathbf{Z})$ of the form $U = \begin{pmatrix} 1 & {}^tb \\ 0 & V \end{pmatrix}$ (where $b \in \mathbf{Z}^{n-1}$ and $V$ is an $(n-1) \times (n-1)$ square matrix) can be constructed so that the matrix $Q'' = Q'[U]$ satisfies

$$Q''[e_1] = m(Q'') = m(Q') \quad\text{and}\quad Q''[e_2] = \min_{\substack{x \in \mathbf{Z}^n \\ \gcd(x_2, \dots, x_n) = 1}} Q''[x].$$

e) A matrix $Q \in \mathcal{S}_n(\mathbf{R})$ is called reduced if it satisfies the following property:

$$\forall k \in [1, n], \quad Q[e_k] = \min_{\substack{x \in \mathbf{Z}^n \\ \gcd(x_k, \dots, x_n) = 1}} Q[x].$$

By iterating the procedure from the previous questions, prove that every matrix in $\mathcal{S}_n(\mathbf{R})$ is equivalent to a reduced matrix.

f) Let $Q \in \mathcal{S}_n(\mathbf{R})$ be a reduced matrix with coefficients $q_{i,j}$. Prove that $0 < q_{1,1} \leq q_{2,2} \leq \cdots \leq q_{n,n}$ and that $2|q_{i,j}| \leq q_{i,i}$ for $i \neq j$.


g) Let $Q \in \mathcal{S}_n(\mathbf{R})$. Prove that there exist $D = \mathrm{diag}(d_1, \dots, d_n)$ and $T$, an upper triangular matrix whose coefficients on the diagonal are all equal to 1, such that $Q = D[T]$. Deduce from this Hadamard's inequality:
$$\det(Q) \leq q_{1,1}q_{2,2} \cdots q_{n,n}.$$

h) Let $Q \in \mathcal{S}_n(\mathbf{R})$ be a reduced matrix. Prove the existence of a constant $C_n$ such that
$$\det(Q) \leq q_{1,1}q_{2,2} \cdots q_{n,n} \leq C_n\det(Q).$$

(We can afterwards take $C_2 = 4/3$ and $C_3 = 2$ to be the first admissible values.)


Hint.— If dann < di, then the proof is fairly easy; if not, there exists 
k <n—1 such that dnin K de+ijpti but Madek < dk+1,k41- We then have 


the decomposition Q = Qi 0 ) F e 


0 Qe 0 TL 
extracted from Q which will give us an inequality of the form dg+i,n+1 < 


| where Q; is akx k matrix 


2 
Fans + m(Q2). Finish the proof by applying Hermite’s theorem (8-5.5) 
to Q2 and an induction hypothesis to Qy. 


i) We denote by $\mathcal{H}_n(D)$ the set of equivalence classes of matrices of $\mathcal{S}_n(\mathbf{Z})$ whose determinant equals $D$. Prove that $h_n(D) := \mathrm{card}\,\mathcal{H}_n(D)$ is finite.


j) Prove that $h_2(1) = h_3(1) = 1$.


Hint. It can be shown that any $2 \times 2$ or $3 \times 3$ reduced matrix which has determinant 1 is the identity matrix.


k) Prove that a form $Q = (q_{i,j})_{1 \leq i,j \leq n}$ is positive-definite if and only if

$$\forall k \in [1, n], \quad \det\left((q_{i,j})_{1 \leq i,j \leq k}\right) > 0.$$
l) Application. Let $n$ be a positive integer. Suppose that you know a positive integer $d$ such that $-d$ is a square modulo $r := dn - 1$. Deduce from this that $n$ is the sum of three squares. First show that if $-d = m^2 - \ell r$ (where $\ell$ has to be $\geq 1$), then the matrix

$$Q := \begin{pmatrix} \ell & m & 1 \\ m & r & 0 \\ 1 & 0 & n \end{pmatrix}$$

is positive-definite and has determinant 1. Then show that $n$ is simply $Q(0, 0, 1)$.


6.10. Exercise. Use the following hints and the previous exercise (3-6.9) to prove the three-square theorem (Theorem 3-1.2); we will also need Dirichlet's theorem: "If $a$ and $b$ are relatively prime, then there exists a prime number $p \equiv a \bmod b$."

a) If $n = 2(2m + 1) \equiv 2 \bmod 4$, prove that we can find a prime number $p$ of the form $p = (4u + 1)n - 1$. Let $d = 4u + 1$, and conclude that $n$ is the sum of three squares.

b) If $n \equiv 1 \bmod 8$ (resp. $\equiv 3 \bmod 8$, resp. $\equiv 5 \bmod 8$), we set $c = 3$ (resp. $c = 1$, resp. $c = 3$). Prove that we can find a prime number $p$ of the form $p = 4un + \frac{cn - 1}{2}$. Let $d = 8u + c$ (so that $2p = nd - 1$), and conclude that $n$ is the sum of three squares.


Finally, prove that the three-square theorem follows from these statements.


6.11. Exercise. (Jacobi's four-square formula, see [79] and the book by Ireland-Rosen [5]) We would like to compute $r_k(m) := \mathrm{card}\{(x_1, \dots, x_k) \in \mathbf{Z}^k \mid x_1^2 + \cdots + x_k^2 = m\}$ for $k = 2$ and 4. To help us do this, we introduce the quantity

$$N_k(m) := \mathrm{card}\{(x_1, \dots, x_k) \in \mathbf{N}^k \mid x_i \text{ odd and } x_1^2 + \cdots + x_k^2 = m\}.$$

a) Let $\chi$ be the character modulo 4 which equals $+1$ (resp. $-1$, resp. 0) if $x \equiv 1 \bmod 4$ (resp. if $x \equiv -1 \bmod 4$, resp. if $x$ is even). Prove that

$$N_2(m) = \sum_{d \mid m} \chi(d),$$

and deduce that

$$r_2(m) = 4\sum_{d \mid m} \chi(d).$$

b) Prove that the following equalities hold for $m \equiv 4 \bmod 8$:

$$N_4(m) = \sum_{R} N_2(m_1)N_2(m_2) = \sum_{S} (-1)^{\frac{a-c}{2}},$$

where $R$ is the set of pairs of natural numbers $(m_1, m_2)$ such that $m_1 + m_2 = m$ and $m_1 \equiv m_2 \equiv 2 \bmod 4$, and $S$ is the set of quadruples of odd natural numbers $(a, b, c, d)$ such that $2ab + 2cd = m$.


c) By setting $a = x + y$, $b = z - t$, $c = x - y$ and $d = z + t$, prove that

$$N_4(m) = \sum_{S'} (-1)^y,$$

where $S'$ is now the set of quadruples $(x, y, z, t) \in \mathbf{Z}^4$ such that $|y| < x$, $|t| < z$, $m = 4(xz - yt)$ and $x$ and $y$ (resp. $z$ and $t$) have different parity. We denote by $\mathcal{N}_0$ (resp. $\mathcal{N}_+$, resp. $\mathcal{N}_-$) the sum restricted to $y = 0$ (resp. $y > 0$, resp. $y < 0$), so that $N_4(m) = \mathcal{N}_0 + \mathcal{N}_+ + \mathcal{N}_-$.

d) Prove that $\mathcal{N}_0 = \sum_{d \mid m} d$ (where the sum is restricted to being over odd divisors) and that $\mathcal{N}_+ = \mathcal{N}_-$. Then show that $\mathcal{N}_+ = 0$ by using the variable change (and checking that it indeed defines a bijection from the subset of $S'$ where $y > 0$ to itself) $x' = 2uz - t$, $y' = z$, $z' = y$ and $t' = 2uy - x$, where $u$ is chosen as the unique integer such that $2u - 1 < x/y < 2u + 1$.

e) Prove Jacobi's formula (3.5) by successively showing that the following identities hold. For every $m$, we have $r_4(4m) = r_4(2m)$; if $m$ is odd, then $r_4(4m) = 16N_4(4m) + r_4(m)$ and $r_4(2m) = 3r_4(m)$. You can use the following "obvious" identity:

$$(x_1 + x_2)^2 + (x_1 - x_2)^2 + (x_3 + x_4)^2 + (x_3 - x_4)^2 = 2x_1^2 + 2x_2^2 + 2x_3^2 + 2x_4^2.$$


6.12. Exercise. Let $N = pq$ be an RSA number obtained from two very large prime numbers. Let $d$ be the public exponent and $e$ the secret one, so that $de \equiv 1 \bmod \phi(N)$.


a) Observe that $\phi(N) \sim N$ (size-wise), and show that there exists an integer $k \sim ed/\phi(N)$ such that

$$\frac{d}{N} - \frac{k}{e} = \frac{1}{eN} - \frac{k}{e}\left(1 - \frac{\phi(N)}{N}\right).$$

b) Use Theorem 3-3.15 to prove that if the absolute value of the right hand side is smaller than $1/(2e^2)$, then the continued fraction expansion algorithm of $d/N$ gives a fast computation of $k/e$ and hence of $e$.


c) Prove that ife< zu, the previous condition holds. 
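The continued fraction computation described in this exercise can be sketched as follows. This is an illustration under stated assumptions, not a solution of the exercise: following the notation above, `public` plays the role of $d$ and the returned value plays the role of the secret exponent $e$; the acceptance test (checking that the candidate value of $\phi(N)$ leads to an integer factorization of $N$) is one common choice among several; the function names are hypothetical.

```python
from math import isqrt


def convergents(num, den):
    """Yield the convergents (h, k) of the continued fraction of num/den."""
    h0, h1 = 0, 1          # h_{-2}, h_{-1}
    k0, k1 = 1, 0          # k_{-2}, k_{-1}
    while den:
        a, r = divmod(num, den)
        h0, h1 = h1, a * h1 + h0
        k0, k1 = k1, a * k1 + k0
        yield h1, k1
        num, den = den, r


def recover_secret_exponent(public, N):
    """Look for k/e among the convergents of public/N; return e or None."""
    for k, e in convergents(public, N):
        if k == 0 or (public * e - 1) % k:
            continue
        phi = (public * e - 1) // k      # candidate value of phi(N)
        s = N - phi + 1                  # would be p + q
        disc = s * s - 4 * N             # would be (p - q)^2
        if disc >= 0 and isqrt(disc) ** 2 == disc:
            return e
    return None
```

With the toy modulus $N = 379 \cdot 239 = 90581$ and public exponent 17993, the second convergent of $17993/90581$ is $1/5$, and the procedure returns the secret exponent 5.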


6.13. Exercise. Let $K$ be a number field of degree $n = [K : \mathbf{Q}]$ and $\sigma_1, \dots, \sigma_n$ its real or complex embeddings. For every lattice $L = \mathbf{Z}\alpha_1 \oplus \cdots \oplus \mathbf{Z}\alpha_n$, we set

$$\Delta_L := \left(\det(\sigma_i(\alpha_j))\right)^2.$$
1) Prove that if $L \subset \mathcal{O}_K$, then $\Delta_L \in \mathbf{Z}$ and that if $L'$ is a sublattice of index $u$ in $L$, then $\Delta_{L'} = u^2\Delta_L$.


2) Let $\alpha$ be an algebraic integer such that $K = \mathbf{Q}(\alpha)$, and let $f(X)$ be its minimal polynomial. We set $L = \mathbf{Z}[\alpha]$. Prove that

$$\Delta_L = (-1)^{\frac{n(n-1)}{2}}\,\mathrm{N}^K_{\mathbf{Q}}(f'(\alpha)).$$


6.14. Exercise. Let $K$ be a number field such that $\mathcal{O}_K = \mathbf{Z}[\alpha]$ for some $\alpha$. We denote by $f(X)$ the minimal polynomial of $\alpha$. Prove that $p$ is ramified in $K/\mathbf{Q}$ if and only if $f(X)$ has a double root in an algebraic closure of $\mathbf{F}_p$. Deduce from this that $p$ is ramified in $K/\mathbf{Q}$ if and only if $p$ divides $\Delta_K$.

Remark. It can be shown (but this is more difficult) that this last conclusion is still true even if we do not assume the existence of $\alpha$ such that $\mathcal{O}_K = \mathbf{Z}[\alpha]$.


6.15. Exercise. ("Computational" proof of the nondegeneracy of the bilinear form $(x, y) \mapsto \mathrm{Tr}(xy) = \mathrm{Tr}^K_{\mathbf{Q}}(xy)$.) Let $K$ be a number field of degree $n := [K : \mathbf{Q}]$. We set, for $x_1, \dots, x_n \in K$,
$$D(x_1, \dots, x_n) := \det\left(\mathrm{Tr}(x_ix_j)\right).$$

a) Let $\sigma_1, \dots, \sigma_n$ be the real and complex embeddings of $K$. Check that
$$D(x_1, \dots, x_n) = \det\left(\sigma_i(x_j)\right)^2.$$

b) Suppose that the elements $y_i$ are obtained from the $x_i$ by a $\mathbf{Q}$-linear transformation, $A$. Verify that therefore

$$D(y_1, \dots, y_n) = (\det A)^2D(x_1, \dots, x_n).$$
c) Let $\alpha$ be a primitive element of $K$ (i.e., such that $K = \mathbf{Q}(\alpha)$) and $F(X)$ its minimal polynomial. Prove that
$$D(1, \alpha, \dots, \alpha^{n-1}) = (-1)^{\frac{n(n-1)}{2}}\,\mathrm{N}^K_{\mathbf{Q}}(F'(\alpha)) \neq 0.$$

Deduce from this that $D(x_1, \dots, x_n) \neq 0$ if and only if $x_1, \dots, x_n$ form a basis (over $\mathbf{Q}$) of $K$ and that the bilinear form given by the trace is non-degenerate. (This result is in fact valid for every finite separable extension $L/K$.)


6.16. Exercise. In this exercise, you are asked to show that the ring of integers of $K = \mathbf{Q}(i\sqrt{d})$, for $d = 1, 2, 3, 7, 11, 19, 43, 67, 163$, is principal.

i) For the first five values of $d$, the ring is Euclidean for the norm.

ii) For $d = 19, 43, 67, 163$, the integer $n = \frac{d+1}{4}$ is prime and the prime numbers $< n$ are inert in $K$.

iii) Check that every ideal of norm $\leq 2\sqrt{d}/\pi$ is principal and finish the proof by using Corollary 3-5.10 and the remarks that follow it.


(Note: it is more difficult to prove, but true, that these are the only rings of imaginary quadratic integers that are principal.)
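The assertion of part ii) can be checked by machine for the four values of $d$. The sketch below is only a numerical verification and does not replace the argument asked for in the exercise; it uses the criterion that an odd prime $p$ is inert in $\mathbf{Q}(i\sqrt{d})$ when the Legendre symbol of $-d$ modulo $p$ equals $-1$, and that 2 is inert when $-d \equiv 5 \bmod 8$. The function names are hypothetical.

```python
def is_prime(n):
    """Trial division, sufficient for the small integers used here."""
    if n < 2:
        return False
    return all(n % q for q in range(2, int(n ** 0.5) + 1))


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def small_primes_inert(d):
    """True if n = (d+1)/4 is prime and every prime p < n is inert in Q(i*sqrt(d))."""
    if (d + 1) % 4 or not is_prime((d + 1) // 4):
        return False
    n = (d + 1) // 4
    for p in filter(is_prime, range(2, n)):
        inert = (-d) % 8 == 5 if p == 2 else legendre(-d, p) == -1
        if not inert:
            return False
    return True
```

The check succeeds for $d = 19, 43, 67, 163$, where $n = 5, 11, 17, 41$ respectively, and fails for instance for $d = 23$, where $n = 6$ is not prime.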


6.17. Exercise. Let $\zeta := \exp(2\pi i/5)$ and $K = \mathbf{Q}(\zeta)$, hence $\mathcal{O}_K = \mathbf{Z}[\zeta]$. Prove that $\Delta_K = 125$ by using Exercise 3-6.13. Check that 2 and 3 stay prime in $\mathcal{O}_K$ and that $5\mathcal{O}_K = ((1-\zeta)\mathcal{O}_K)^4$. Deduce from this that every ideal of norm $\le 6$ is principal, and conclude that $\mathcal{O}_K$ is principal by using Corollary 3-5.10 and the remarks that follow it.


6.18. Exercise. Let $\omega := \sqrt[3]{2}$ and $K = \mathbf{Q}(\omega)$. Prove that the discriminant (in the sense of Exercise 3-6.13) of $\mathbf{Z}[\omega]$ equals $\pm 3^3 2^2$, and deduce that $\mathcal{O}_K = \mathbf{Z}[\omega]$ or $(\mathcal{O}_K : \mathbf{Z}[\omega]) = 3$. By considering the norm, trace, etc. of $a + b\omega + c\omega^2$, conclude that $\mathcal{O}_K = \mathbf{Z}[\omega]$.

Check that $2 = \omega^3$, $3 = (1+\omega)^3(\omega - 1)$ and $5 = (1+\omega^2)(1 + 2\omega - \omega^2)$. Prove that the elements $\omega$, $1+\omega$, $1+\omega^2$ and $1 + 2\omega - \omega^2$ are prime and that $\omega - 1$ is invertible.

Conclude that $\mathbf{Z}[\sqrt[3]{2}]$ is principal by using Corollary 3-5.10 and the remarks that follow it.


6.19. Exercise. In this exercise, you are asked to study the integer valued solutions of the equation
\[ F(x,y,z) = x^3 + 2y^3 + 4z^3 - 6xyz = m, \tag{$*$} \]
where $m$ is a non-zero integer.

We set $\omega := \sqrt[3]{2}$ and $K = \mathbf{Q}(\omega)$. We will assume (see Exercise 3-6.18) that the ring of integers of $K$ is equal to $\mathcal{O}_K = \mathbf{Z}[\omega]$ and that it is principal.

1) Prove that $1, \omega, \omega^2$ form a basis for $K$ over $\mathbf{Q}$, and compute $\mathrm{N}^K_{\mathbf{Q}}(x + y\omega + z\omega^2)$.

2) Let $p$ be a prime number. Prove that either $p$ stays prime in $\mathcal{O}_K$ or there exists an ideal $\mathfrak{p}$ in $\mathcal{O}_K$ with norm $p$.

Hint. Find the decomposition of $p\mathcal{O}_K$ into products of prime ideals and enumerate the possibilities.

3) Find $\alpha \in \mathcal{O}_K$ (resp. $\beta \in \mathcal{O}_K$) such that $\mathrm{N}^K_{\mathbf{Q}}(\alpha) = 2$ (resp. $\mathrm{N}^K_{\mathbf{Q}}(\beta) = 3$).

4) Let $p \neq 2, 3$. Suppose that there exists an ideal $\mathfrak{p}$ in $\mathcal{O}_K$ of norm $p$; prove that there exists $a \in \mathbf{F}_p$ such that $a^3 = 2$.

5) Let $p \neq 2, 3$. Suppose that there exists $a \in \mathbf{Z}$ such that $a^3 \equiv 2 \bmod p$; prove that there exists an ideal $\mathfrak{p}$ in $\mathcal{O}_K$ of norm $p$.

Hint. You can look at the factorization $a^3 - 2 = (a - \omega)(a^2 + a\omega + \omega^2)$ and deduce that $p$ is not prime in $\mathcal{O}_K$.

6) Check that if $p$ stays prime in $\mathcal{O}_K$, then $p \equiv 1 \bmod 3$. Prove by counterexample that the converse is false. (Check the case $p = 31$.)

7) We can write $m = \pm\prod_p p^{m_p}$. Prove that equation ($*$) has an integer solution $(x,y,z)$ if and only if for every prime number $p \neq 2, 3$ such that the congruence $a^3 \equiv 2 \bmod p$ does not have a solution, the integer $m_p$ is divisible by 3.

8) Prove that if equation ($*$) has an integer solution $(x,y,z)$, then it has infinitely many.


6.20. Exercise. 1) Let $\alpha$ be an algebraic integer with minimal polynomial $P = X^d + p_{d-1}X^{d-1} + \cdots + p_0 \in \mathbf{Z}[X]$ which generates the number field $K = \mathbf{Q}(\alpha)$. Let $p$ be an odd prime number. Suppose now that $\mathcal{O}_K = \mathbf{Z}[\alpha]$ (or more generally that $p$ does not divide $(\mathcal{O}_K : \mathbf{Z}[\alpha])$). Prove that if the reduction modulo $p$ of the polynomial $P$ can be factored in $\mathbf{F}_p[X]$ into
\[ \bar{P} = P_1^{e_1} \cdots P_r^{e_r}, \]
where the $P_i$ are irreducible and distinct in $\mathbf{F}_p[X]$, then
\[ \mathcal{O}_K/p\mathcal{O}_K \cong \mathbf{F}_p[X]/(P_1^{e_1}) \times \cdots \times \mathbf{F}_p[X]/(P_r^{e_r}) \quad\text{and}\quad p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}, \]
where the $\mathfrak{p}_i$ are prime ideals of norm $\mathrm{N}(\mathfrak{p}_i) = p^{\deg(P_i)}$. (Recall that $p$ is said to be ramified in the extension $K/\mathbf{Q}$ if one of the $e_i$ is $\ge 2$.)

2) Let $\Phi_m \in \mathbf{Z}[X]$ be the $m$th cyclotomic polynomial. Recall how to factor $\Phi_m$ in $\mathbf{F}_p[X]$. (Treat the case where $p$ divides $m$ separately.)

3) Let $\zeta$ be a primitive $m$th root of unity and $K = \mathbf{Q}(\zeta)$. Since $\mathbf{Q}(\zeta) = \mathbf{Q}(-\zeta)$, we can assume that either $m$ is odd or 4 divides $m$; we will also assume that $\mathcal{O}_K = \mathbf{Z}[\zeta]$. Prove that $p$ is ramified in the extension $K/\mathbf{Q}$ if and only if $p$ divides $m$. Assuming that $p$ is relatively prime to $m$, we let $r$ be the order of $p$ modulo $m$. Prove that
\[ p\mathcal{O}_K = \mathfrak{p}_1 \cdots \mathfrak{p}_{\phi(m)/r}, \quad\text{where } \mathrm{N}\mathfrak{p}_i = p^r. \]

We will now assume that $m = 5$, $\zeta := \exp(2\pi i/5)$ and $K = \mathbf{Q}(\zeta)$.

4) Give a necessary and sufficient condition for an integer $n \in \mathbf{N}^*$ to be the norm of an ideal in $\mathcal{O}_K$.

Hint. Look for a condition in terms of the factorization of $n$.

5) We will consider the Gauss sum
\[ \tau = \sum_{x \in \mathbf{F}_5} \exp\left(\frac{2\pi i x^2}{5}\right). \]
Find, up to sign, the value of $\tau$, and deduce from this that $K$ contains the real quadratic field $\mathbf{Q}(\sqrt{5})$.

6) Prove that $\epsilon := \frac{1+\sqrt{5}}{2}$ generates a subgroup of finite index of the group of units $\mathcal{O}_K^*$.

7) Supposing that $\mathcal{O}_K$ is principal (see Exercise 3-6.17), describe the set of solutions $(x,y,z,t) \in \mathbf{Z}^4$ of the equation
\[ \mathrm{N}^K_{\mathbf{Q}}(x + y\zeta + z\zeta^2 + t\zeta^3) = m \]
for $m = 2$, $m = 5$, $m = 2^4 \cdot 11 = 176$. In particular, specify whether the set of solutions is empty, finite or infinite.

6.21. Exercise. In this exercise, you are asked to determine which natural numbers can be written in the form $x^2 + 3y^2$ or (equivalently) in the form $x^2 - xy + y^2$.

1) We let $j := \frac{-1 + i\sqrt{3}}{2}$. Consider the rings $A_0 = \mathbf{Z}[i\sqrt{3}]$ and $A = \mathbf{Z}[j]$. Prove that $A$ is principal and factorial, but that $A_0$ is not even factorial. Specify the groups of units $A_0^*$ and $A^*$.

2) Let the norm be defined by $\mathrm{N} : \mathbf{Q}(i\sqrt{3}) \to \mathbf{Q}$. Check that $\mathrm{N}(a + bi\sqrt{3}) = a^2 + 3b^2$, $\mathrm{N}(x + yj) = x^2 - xy + y^2$, and prove that an integer $n$ is the norm of an element of $A_0$ if and only if it is the norm of an element in $A$.

Hint. You could show that if $\alpha$ is in $A$ but not in $A_0$, then $j\alpha$ or $j^2\alpha$ is in $A_0$.

3) Let $p$ be a prime number not equal to 2 or 3. Prove that if $p$ is the norm of an element of $A_0$ (or $A$), then $-3$ is a square modulo $p$ and deduce from this that $p \equiv 1 \bmod 3$.

4) Let $p$ be a prime other than 2. Prove that if $p \equiv 2 \bmod 3$ and $n = mp$ is the norm of an element of $A_0$ (or of $A$), then $m = n'p$ and $n'$ is also a norm.

5) Prove that 2 is an irreducible element in $A$. Deduce from this that if $n = 2m$ is the norm of an element of $A_0$ (or of $A$), then $m = 2n'$ and $n'$ is also a norm.

6) Now assume that $p \equiv 1 \bmod 3$. Prove that $-3$ is a square modulo $p$ and deduce from this that $p$ is not irreducible in $A$ and, consequently, that it is a norm.

7) By using the previous questions, prove the following result.

An integer $n \ge 1$ can be written as $x^2 + 3y^2$ or, equivalently, as $x^2 - xy + y^2$ where $x, y \in \mathbf{Z}$ if and only if for every prime $p \equiv 2 \bmod 3$, $\mathrm{ord}_p(n)$ is even.

8) State and prove a similar result with the norm associated to a quadratic field whose ring of integers is principal.
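The criterion stated in question 7 can be tested numerically before attempting the proof. The following sketch is only a brute-force sanity check under our own naming (the functions `is_x2_plus_3y2` and `satisfies_criterion` are illustrative and not part of the exercise); it compares direct representability by $x^2 + 3y^2$ with the parity condition on $\mathrm{ord}_p(n)$ for primes $p \equiv 2 \bmod 3$.

```python
from math import isqrt


def is_x2_plus_3y2(n):
    """Brute force: is n = x^2 + 3*y^2 for some integers x, y >= 0?"""
    y = 0
    while 3 * y * y <= n:
        r = n - 3 * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False


def satisfies_criterion(n):
    """Is ord_p(n) even for every prime p congruent to 2 modulo 3?"""
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 3 == 2 and e % 2 == 1:
                return False
        p += 1
    # What remains is 1 or a single prime with exponent 1.
    return not (n > 1 and n % 3 == 2)


if __name__ == "__main__":
    print([n for n in range(1, 40) if is_x2_plus_3y2(n)])
```

Such a check proves nothing, of course, but a disagreement on some small $n$ would point to a misreading of the statement.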


6.22. Exercise. Prove that the only integer solutions of the equation
\[ y^2 = x^3 - 2 \]
are $(x,y) = (3, \pm 5)$.

Hint. First show that $x, y$ must be odd. By working in $A := \mathbf{Z}[i\sqrt{2}]$, prove that $y + i\sqrt{2}$ must be a cube in $A$ and conclude by identifying the real and imaginary parts.
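A finite search cannot replace the argument in $\mathbf{Z}[i\sqrt{2}]$, but it does illustrate the claim. The sketch below (the function name `solutions` and the search bound are our own choices) lists all integer solutions with $2 \le x \le$ `bound`; smaller $x$ make $x^3 - 2$ negative.

```python
from math import isqrt


def solutions(bound):
    """Integer solutions (x, y) of y^2 = x^3 - 2 with 2 <= x <= bound."""
    found = []
    for x in range(2, bound + 1):
        r = x ** 3 - 2
        y = isqrt(r)
        if y * y == r:
            found += [(x, y), (x, -y)]
    return found


if __name__ == "__main__":
    print(solutions(10_000))
```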


6.23. Exercise. In this exercise, you are asked to finish the proof of the unit theorem, by constructing $r_1 + r_2 - 1$ independent units. (You can assume that $r_1 + r_2 - 1 \ge 1$.)

a) Let $A = (a_{i,j})$ be an $r \times r$ matrix such that $\forall i$, $|a_{i,i}| > \sum_{j \neq i} |a_{i,j}|$. Prove that $A$ is invertible.

Let $r = r_1 + r_2 - 1$ and $\Phi : \mathcal{O}_K \hookrightarrow \mathbf{R}^{r_1} \times \mathbf{C}^{r_2} =: E$ be the usual embedding. We denote by $|x|_i := |\sigma_i(x)|$, for $i = 1, \dots, r_1 + r_2$, the absolute values associated to the different embeddings (resp. pairs of conjugate embeddings).

b) Let $C > 1$. By repeating the proof of the existence of infinitely many units, prove that a unit $\epsilon_j$ can be constructed for every absolute value such that
\[ |\epsilon_j|_j > C, \quad\text{and}\quad |\epsilon_j|_i < 1/4, \ \text{for } i \neq j. \]

Hint. Construct, using Minkowski’s theorem, algebraic integers such that $|\alpha|_i < 1/4$ for $i \neq j$ and $|\mathrm{N}^K_{\mathbf{Q}}(\alpha)| < C$, and use this to find a subset which generates the same ideal.

c) Prove that $\epsilon_1, \dots, \epsilon_r$ are independent.


6.24. Exercise. Let $\alpha = \sqrt[3]{2}$ and $K = \mathbf{Q}(\alpha)$. In this exercise, you are asked to prove that the equation
\[ \mathrm{N}^K_{\mathbf{Q}}(x + 4y + z\alpha + w\alpha^2) - 6(x + y)(x^2 + xy + 7y^2) = 0, \tag{3.11} \]
which has nontrivial solutions modulo $N$ for every $N \ge 2$ (cf. Exercise 1-6.26 by taking $w = 0$), does not have any nontrivial integer solutions.

1) Let $(x,y,z,w) \in \mathbf{Z}^4$ be a primitive solution. We let $d = \gcd(x,y)$. Prove that $\gcd(6,d) = 1$ and that 3 does not divide $f(x,y) := x^2 + xy + 7y^2$. Check that, in particular, $x \not\equiv y \bmod 3$.

2) i) Assume that $p$ does not divide $d$. Prove that if $p \equiv 2 \bmod 3$, then $p$ does not divide $f(x,y)$ and that if $p \equiv 1 \bmod 3$ and 2 is not a cube modulo $p$, then $p$ does not divide $f(x,y)$.

Hint. If not, $p$ would divide $x + 4y$ and this would force $p$ to be equal to 19.

ii) Finally show that if $p \equiv 1 \bmod 3$ and 2 is a cube modulo $p$, then there exist integers $a$ and $b$ such that $p = a^2 + 27b^2 = (a + 3bi\sqrt{3})(a - 3bi\sqrt{3})$.

Hint. We know $(a + bj)^3 = A + 3Bj$. A norm of $K(i\sqrt{3})/\mathbf{Q}(i\sqrt{3})$ can therefore be written as $A + 3Bj$.

3) Let $\rho := \frac{1 + i\sqrt{3}}{2}$, so we have the factorization
\[ f(x,y) = (x + y(3\rho - 1))(x + y(2 - 3\rho)). \]
Deduce from the previous arguments that there exist integers $a$, $b$ and $m$ such that
\[ x + y(3\rho - 1) = \rho^m (a + 3bi\sqrt{3}). \]

4) By reducing modulo 3, prove that $\rho^m = \pm 1$, then that $y$ is even and $x$ is odd. Conclude by referring back to the equation.

Note: this example is due to Birch and Swinnerton-Dyer and was taken from [13].


Chapter 4 


Analytic Number Theory


“ Eh! qu’aimes-tu donc, extraordinaire étranger? 
— J’aime les nuages...les nuages qui passent...la-bas...la-bas... 
les merveilleux nuages!” 


CHARLES BAUDELAIRE 


The theme of this chapter is the distribution of prime numbers. We will 
begin by giving some statements and relatively elementary proofs, before 
introducing the key tool: the classical theory of functions of a complex vari- 
able, of which we will give a brief overview. The two following sections 
contain proofs of Dirichlet’s “theorem on arithmetic progressions” and the 
“prime number theorem”. Dirichlet series and in particular the Riemann 
zeta function play a fundamental role. We will illustrate this by additionally 
proving the functional equation of the zeta function and by formulating the 
famous Riemann hypothesis. 


1. Elementary Statements and Estimates 


The (written) history of prime numbers generally begins with the following 
theorem. 


1.1. Theorem. (Euclid) The set of prime numbers is infinite.¹

Given a finite list of prime numbers, $p_1, \dots, p_r$, Euclid’s argument consists of constructing a new prime number by considering possible prime factors of $N := p_1 \cdots p_r + 1$.
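Euclid’s construction is easy to try out. In the sketch below (the helper names are ours), we form $N = p_1 \cdots p_r + 1$ and return its smallest prime factor, which cannot be any of the $p_i$ since each $p_i$ leaves remainder 1 when dividing $N$. Note that $N$ itself need not be prime.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime factor of n >= 2, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def euclid_new_prime(primes):
    """A prime not in the given finite list, following Euclid's argument."""
    return smallest_prime_factor(prod(primes) + 1)


if __name__ == "__main__":
    # 2*3*5*7*11*13 + 1 = 30031 = 59 * 509 is not prime,
    # but its prime factors are new.
    print(euclid_new_prime([2, 3, 5, 7, 11, 13]))
```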


¹ Euclid’s statement of course does not mention infinity; it says that given a finite collection of prime numbers, one can deduce another one from it.

M. Hindry, Arithmetics, Universitext,
DOI 10.1007/978-1-4471-2131-2_4

There are many ways to expand on Euclid’s proposition.


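1.2. Proposition. The series with terms $\log(p)\,p^{-1}$ and $p^{-1}$ are divergent.

To be more precise,
\[ \sum_{p \le x} \frac{\log(p)}{p} = \log x + O(1) \quad\text{and}\quad \sum_{p \le x} \frac{1}{p} = \log\log x + C + O\left(\frac{1}{\log x}\right). \tag{4.1} \]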
These statements can be refined as follows. 


1.3. Theorem. (Prime number theorem) As $x$ tends to infinity, we have the following asymptotic behavior:
\[ \pi(x) := \mathrm{card}\{p \text{ prime},\ p \le x\} \sim \frac{x}{\log x}. \tag{4.2} \]

We could state various equivalent forms of this theorem, for example,
\[ \theta(x) \sim x; \quad \psi(x) \sim x \quad\text{or also}\quad p_n \sim n\log n, \]
where we let (following Tchebychev)
\[ \theta(x) := \sum_{p \le x} \log p, \qquad \psi(x) := \sum_{p^m \le x} \log p \tag{4.3} \]
and where $p_n$ denotes the $n$th prime number. We will also prove the following theorem.


1.4. Theorem. (Dirichlet’s theorem on arithmetic progressions) Let $a, b \ge 1$ be two relatively prime integers. Then there exist infinitely many primes $p$ of the form $a + bn$.


We could make this statement more precise by showing that prime numbers 
are distributed more or less uniformly over the congruence classes modulo b. 


1.5. Theorem. With the same hypotheses as before, we have
\[ \sum_{\substack{p \le x \\ p \equiv a \bmod b}} \frac{1}{p} = \frac{\log\log x}{\phi(b)} + O(1). \]


We will now give a more refined statement but will not however prove it.

1.6. Theorem. As $x$ tends to infinity, we have the following asymptotic behavior:
\[ \pi(x; a, b) := \mathrm{card}\{p \text{ prime},\ p \le x,\ p \equiv a \bmod b\} \sim \frac{x}{\phi(b)\log x}. \tag{4.4} \]
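The uniform distribution over congruence classes can be observed experimentally. The following sketch (helper names are ours; the modulus $b = 10$ and the bound $10^5$ are arbitrary choices) counts primes $p \le x$ in each residue class modulo $b$; for the classes prime to $b$, the counts should all be close to $\pi(x)/\phi(b)$.

```python
import math


def primes_up_to(x):
    """List of primes <= x (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [p for p in range(2, x + 1) if sieve[p]]


def primes_by_residue(x, b):
    """Map each residue a mod b to the number of primes p <= x with p = a mod b."""
    counts = {}
    for p in primes_up_to(x):
        counts[p % b] = counts.get(p % b, 0) + 1
    return counts


if __name__ == "__main__":
    print(primes_by_residue(10**5, 10))
```

The classes of 2 and 5 modulo 10 each contain exactly one prime, which is why the theorem requires $a$ and $b$ to be relatively prime.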


In this section, we will expand on some so-called “elementary” methods (which in this context means that they do not involve complex variables) which can be used to prove the previous assertions, except for Dirichlet’s theorem on arithmetic progressions and the prime number theorem. They will however allow us to prove a partial version: there exist two constants, $c_1, c_2 > 0$, such that $c_1 x/\log x \le \pi(x) \le c_2 x/\log x$.
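To get a feeling for the size of the constants, one can tabulate the ratio $\pi(x)\log x / x$. The sketch below (function names are ours) computes $\pi(x)$ with a sieve; it is only an illustration, since a finite table obviously proves nothing about the limit.

```python
import math


def prime_count(x):
    """Return pi(x), the number of primes p <= x, using a simple sieve."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


def chebyshev_ratio(x):
    """The ratio pi(x) * log(x) / x, which stays between two positive constants."""
    return prime_count(x) * math.log(x) / x


if __name__ == "__main__":
    for k in range(2, 7):
        print(10**k, prime_count(10**k), round(chebyshev_ratio(10**k), 4))
```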


1.7. Lemma. The following estimate holds: $n\log 2 \le \log\binom{2n}{n} \le n\log 4$.

Proof. From the binomial theorem, we know that $\binom{2n}{n} \le \sum_{k=0}^{2n}\binom{2n}{k} = (1+1)^{2n} = 4^n$. Next, we have the following lower bound:
\[ \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \frac{2n(2n-1)\cdots(n+1)}{n(n-1)\cdots 1} \ge 2^n. \]
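The two inequalities of Lemma 4-1.7 are easy to check on a range of values. The sketch below (the function name is ours) returns the three quantities $n\log 2$, $\log\binom{2n}{n}$ and $n\log 4$ so that they can be compared.

```python
import math


def central_binomial_bounds(n):
    """Return (n log 2, log C(2n, n), n log 4) for comparison."""
    return n * math.log(2), math.log(math.comb(2 * n, n)), n * math.log(4)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        low, mid, high = central_binomial_bounds(n)
        print(n, round(low, 3), round(mid, 3), round(high, 3))
```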


1.8. Lemma. The following formula holds: $\mathrm{ord}_p(n!) = \sum_{m \ge 1} \left\lfloor \frac{n}{p^m} \right\rfloor$; furthermore, the sum can be restricted to $m \le \log n/\log p$.

Proof. Write $n! = 1 \cdot 2 \cdot 3 \cdots n = \prod_{k=1}^{n} k$. The number of integers $\le n$ which are divisible by $p$ is $\lfloor n/p \rfloor$, and the number of integers $\le n$ divisible by $p^2$ is $\lfloor n/p^2 \rfloor$, etc. Thus $\mathrm{ord}_p(n!)$ is the sum of the $\lfloor n/p^m \rfloor$. Finally, $p^m \le n$ is equivalent to $m \le \log n/\log p$, which proves the restriction on $m$.
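The formula of Lemma 4-1.8 translates directly into a short loop over the powers $p^m \le n$. In the sketch below (function names are ours), we compare it with the valuation obtained by actually dividing $n!$ by $p$ as many times as possible.

```python
import math


def ord_p_factorial(n, p):
    """ord_p(n!) as the sum of floor(n / p^m) over all p^m <= n."""
    total, q = 0, p
    while q <= n:
        total += n // q
        q *= p
    return total


def ord_p(value, p):
    """Direct p-adic valuation of a positive integer."""
    e = 0
    while value % p == 0:
        value //= p
        e += 1
    return e


if __name__ == "__main__":
    print(ord_p_factorial(100, 5), ord_p(math.factorial(100), 5))
```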


We can therefore write
\[ \log\binom{2n}{n} = \sum_{p \le 2n} \mathrm{ord}_p\binom{2n}{n}\log p = \sum_{p \le 2n}\left(\sum_{m \ge 1} \left\lfloor \frac{2n}{p^m} \right\rfloor - 2\left\lfloor \frac{n}{p^m} \right\rfloor\right)\log p. \tag{4.5} \]
To find a lower bound, we only keep the terms that satisfy $n < p \le 2n$. In fact, such a $p$ clearly divides $\binom{2n}{n} = (2n)!/(n!)^2$, and thus we obtain
\[ n\log 4 \ge \log\binom{2n}{n} \ge \sum_{n < p \le 2n} \log p = \theta(2n) - \theta(n). \]

From this, we obtain an upper bound of the form $\theta(x) \le Cx$. This is true because
\[ \theta(2^m) = \sum_{k=0}^{m-1} \big(\theta(2^{k+1}) - \theta(2^k)\big) \le \sum_{k=0}^{m-1} 2^k\log 4 = (2^m - 1)\log 4. \]
Therefore, if $2^m \le x < 2^{m+1}$, then
\[ \theta(x) \le \theta(2^{m+1}) \le 2^{m+1}\log 4 \le (2\log 4)x. \tag{4.6} \]


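To obtain an upper bound, we could notice that $\lfloor 2u \rfloor - 2\lfloor u \rfloor$ always equals 0 or 1 and equals 0 whenever $u < 1/2$. Thus
\[ n\log 2 \le \log\binom{2n}{n} = \sum_{p \le 2n}\left(\sum_{m \ge 1} \left\lfloor \frac{2n}{p^m} \right\rfloor - 2\left\lfloor \frac{n}{p^m} \right\rfloor\right)\log p \le \sum_{p \le 2n}\left(\frac{\log(2n)}{\log p}\right)\log p = \log(2n)\,\pi(2n). \]

From this, we have a lower bound of the form $\pi(x) \ge Cx/\log x$. This is true because if $2n \le x < 2(n+1)$, then
\[ \pi(x) \ge \pi(2n) \ge \frac{n\log 2}{\log(2n)} \ge \left(\frac{x}{2} - 1\right)\frac{\log 2}{\log x}. \tag{4.7} \]
Furthermore, we can easily see that
\[ \theta(x) = \sum_{p \le x} \log p \le \log x \sum_{p \le x} 1 = \pi(x)\log x. \tag{4.8} \]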


Next, notice that for $2 \le y < x$,
\[ \pi(x) - \pi(y) = \sum_{y < p \le x} 1 \le \frac{1}{\log y}\sum_{y < p \le x} \log p = \frac{1}{\log y}\big(\theta(x) - \theta(y)\big). \]
It follows that
\[ \pi(x) \le \frac{\theta(x)}{\log y} + \pi(y) \le \frac{\theta(x)}{\log y} + y. \]
By choosing $y = x/(\log x)^2$ and by recalling the previous inequality (4.8), we have
\[ \frac{\theta(x)}{\log x} \le \pi(x) \le \frac{\theta(x)}{\log x - 2\log\log x} + \frac{x}{(\log x)^2}. \tag{4.9} \]

To summarize, it is easy to see from inequalities (4.6), (4.7), (4.8) and (4.9) that ($\theta(x) \sim x$) is equivalent to ($\pi(x) \sim x/\log x$) and that
\[ C_1 x \le \theta(x) \le C_2 x \quad\text{and}\quad C_3 x/\log x \le \pi(x) \le C_4 x/\log x. \tag{4.10} \]


Furthermore, the following comparison of the function $\theta(x)$ to the function $\psi(x)$ is not difficult to see:
\[ \theta(x) \le \psi(x) := \sum_{p^m \le x} \log p = \theta(x) + \theta(\sqrt{x}) + \theta(\sqrt[3]{x}) + \cdots \le \theta(x) + \frac{\log x}{\log 2}\,\theta(\sqrt{x}) \le \theta(x) + C\log x\,\sqrt{x}. \]

Finally, if we denote by $p_n$ the $n$th prime number, we have $\pi(p_n) = n$ by definition. The prime number theorem therefore implies that $n \sim p_n/\log(p_n)$ and that $p_n \sim n\log n$. We can check that the latter statement is in fact equivalent to the prime number theorem.


1.9. Lemma. (Abel’s formula) Let $A(x) := \sum_{n \le x} a_n$ and $f$ be a function of class $\mathcal{C}^1$. Then,
\[ \sum_{y < n \le x} a_n f(n) = A(x)f(x) - A(y)f(y) - \int_y^x A(t)f'(t)\,dt. \tag{4.11} \]

Proof. We first point out that $\int_n^{n+1} A(t)f'(t)\,dt = A(n)\int_n^{n+1} f'(t)\,dt = A(n)\big(f(n+1) - f(n)\big)$. Therefore, setting $N = \lfloor x \rfloor$ and $M = \lfloor y \rfloor$ yields
\begin{align*}
\int_M^N A(t)f'(t)\,dt &= \sum_{n=M}^{N-1} A(n)\big(f(n+1) - f(n)\big) \\
&= \sum_{n=M+1}^{N} f(n)\big(A(n-1) - A(n)\big) + f(N)A(N) - A(M)f(M) \\
&= -\sum_{n=M+1}^{N} f(n)a_n + f(N)A(N) - A(M)f(M).
\end{align*}
This proves the formula when $x$ and $y$ are integers. For the general formula, observe that
\[ \int_{\lfloor x \rfloor}^{x} A(t)f'(t)\,dt = A(x)\big(f(x) - f(\lfloor x \rfloor)\big) = A(x)f(x) - A(x)f(\lfloor x \rfloor). \]

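Applications. 1) The formula gives a fairly precise comparison between the “sum” and the “integral” (see Exercise 4-6.10 for some refinements). To be more precise, if we take $a_n = 1$ and integrate by parts, we have, for integers $a < b$:
\[ \sum_{n=a+1}^{b} f(n) = \int_a^b f(t)\,dt + \int_a^b (t - \lfloor t \rfloor)f'(t)\,dt. \tag{4.12} \]
If we choose $f(t) = 1/t$, we obtain
\[ \sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\left(\frac{1}{x}\right), \]
where $\gamma := 1 - \int_1^\infty (t - \lfloor t \rfloor)\frac{dt}{t^2}$ is Euler’s constant.

2) Take $a_n = 1$, so $A(t) = \lfloor t \rfloor$, $y = 1$ and $f(t) = \log t$. We therefore have
\begin{align*}
\log(\lfloor x \rfloor!) &= \lfloor x \rfloor\log(x) - \int_1^x \frac{\lfloor t \rfloor\,dt}{t} \\
&= x\log x - \int_1^x dt + (\lfloor x \rfloor - x)\log x - \int_1^x \frac{\lfloor t \rfloor - t}{t}\,dt \\
&= x\log x - x + O(\log x).
\end{align*}

We should point out that Stirling’s formula gives a slightly more precise statement, namely $n! \sim n^n e^{-n}\sqrt{2\pi n}$, and hence $\log(n!) = n\log n - n + \frac{1}{2}\log n + \frac{1}{2}\log(2\pi) + \epsilon(n)$ where $\lim_{n \to \infty} \epsilon(n) = 0$.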
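The remainder $\epsilon(n)$ in Stirling’s formula can be observed numerically. The sketch below (the function name is ours) uses `math.lgamma(n + 1)`, which returns $\log(n!)$ in floating point, and subtracts the main terms; the printed values decrease to 0, in practice roughly like $1/(12n)$.

```python
import math


def stirling_error(n):
    """epsilon(n) = log(n!) - (n log n - n + (1/2) log n + (1/2) log(2 pi))."""
    main = n * math.log(n) - n + 0.5 * math.log(n) + 0.5 * math.log(2 * math.pi)
    return math.lgamma(n + 1) - main


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, stirling_error(n))
```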


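Furthermore, we see that
\begin{align*}
\log(\lfloor x \rfloor!) &= \sum_{p \le x} \mathrm{ord}_p(\lfloor x \rfloor!)\log p \\
&= \sum_{p \le x}\sum_{m \ge 1} \left\lfloor \frac{x}{p^m} \right\rfloor \log p \\
&= x\sum_{p \le x} \frac{\log p}{p} + \sum_{p \le x} \log p\left(\left\lfloor \frac{x}{p} \right\rfloor - \frac{x}{p}\right) + \sum_{p \le x}\sum_{m \ge 2} \left\lfloor \frac{x}{p^m} \right\rfloor \log p \\
&= x\sum_{p \le x} \frac{\log p}{p} + O(x),
\end{align*}
where the last estimate comes from the upper bound $\theta(x) = \sum_{p \le x} \log p = O(x)$, from (4.6) and the estimate
\[ \sum_{p \le x}\sum_{m \ge 2} \left\lfloor \frac{x}{p^m} \right\rfloor \log p \le x\sum_{p \le x}\sum_{m \ge 2} \frac{\log p}{p^m} = x\sum_{p \le x} \frac{\log p}{p(p-1)} = O(x). \]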


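From this, we can deduce the first formula in Proposition 4-1.2,
\[ \sum_{p \le x} \frac{\log p}{p} = \log x + O(1). \tag{4.13} \]

To get the second, we apply Abel’s formula (Lemma 4-1.9) letting $f(t) = 1/\log t$ and $a_n = \log p/p$ if $n = p$ is prime and $a_n = 0$ otherwise. By setting $A(x) = \sum_{n \le x} a_n$, we have that
\begin{align*}
\sum_{p \le x} \frac{1}{p} &= \sum_{n \le x} a_n f(n) \\
&= \frac{A(x)}{\log x} + \int_2^x \frac{A(t)}{t(\log t)^2}\,dt \\
&= 1 + O(1/\log x) + \int_2^x \frac{dt}{t\log t} + \int_2^x \frac{A(t) - \log t}{t(\log t)^2}\,dt \\
&= \log\log x - \log\log 2 + 1 + \int_2^x \frac{A(t) - \log t}{t(\log t)^2}\,dt + O(1/\log x).
\end{align*}
Since $A(t) - \log t = O(1)$ by (4.13), the last integral converges as $x$ tends to infinity, with a remainder which is $O(1/\log x)$; this gives the second formula.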

2. Holomorphic Functions 
(Summary/Reminders)


This section, without proofs, is a summary of some of the fundamental 
properties of functions of a complex variable that we will be using. It could 
be helpful to use [74] as a reference. 


Concerning series, we will use the product rule for calculating the product of two absolutely convergent series:
\[ \left(\sum_{n=0}^{\infty} a_n\right)\left(\sum_{n=0}^{\infty} b_n\right) = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} a_k b_{n-k}\right), \]
as well as rearrangement of the order of summation in a series with positive terms:
\[ \sum_{n=0}^{\infty}\left(\sum_{m=0}^{\infty} a_{m,n}\right) = \sum_{m=0}^{\infty}\left(\sum_{n=0}^{\infty} a_{m,n}\right). \]

A power series $S(z) = \sum_{n=0}^{\infty} a_n z^n$ is said to have a radius of convergence $R \ge 0$ (possibly $R = 0$ or $R = +\infty$) if the series converges for all $|z| < R$ and diverges for all $|z| > R$; furthermore, the convergence is absolute in the interior of the disc of convergence and the function is of class $\mathcal{C}^\infty$ with $S^{(k)}(z) = \sum_{n=k}^{\infty} n(n-1)\cdots(n-k+1)a_n z^{n-k}$. In fact, the function $S$ can be expanded as a power series around every point $z_0 \in D(0,R)$, in other words, for every $z \in D(z_0,r) \subset D(0,R)$, we have $S(z) = \sum_{n=0}^{\infty} b_n(z - z_0)^n$ (with $b_n = S^{(n)}(z_0)/n!$). Such a function, if it is not identically zero, only has a finite number of zeros in every closed disc (or compact set) which is contained in $D(0,R)$. We define the multiplicity of a zero, $z_0$, as the integer $k$ such that $S(z) = (z - z_0)^k\sum_{n=0}^{\infty} b_n(z - z_0)^n$ where $b_0 \neq 0$. A function which can be expressed as a power series in a neighborhood of every point is called analytic.


2.1. Definition. A holomorphic function $f : U \to \mathbf{C}$ on an open set $U$ is a function which is (complex) differentiable at every point in $U$, i.e.,
\[ \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = f'(z_0) \in \mathbf{C} \text{ exists}. \]
If $F$ is a closed set in the complex plane, $f$ is said to be holomorphic over $F$ if it is holomorphic over an open set $U$ which contains $F$.


Of course the notions of “differentiable” and “analytic” are very different in 
a real variable; in a complex variable, however, they are equivalent. 


2.2. Proposition. Let $f : U \to \mathbf{C}$ be a holomorphic function and assume that $D(z_0,r) \subset U$. Then for every $z \in D(z_0,r)$, we have $f(z) = \sum_{n=0}^{\infty} a_n(z - z_0)^n$, where $a_n = f^{(n)}(z_0)/n!$.


2.3. Proposition. Let f : U — C be a holomorphic function and assume 
that U is connected and f is not identically zero. Then the set of zeros of 
f is discrete in U. 


2.4. Corollary. Let $f, g : U \to \mathbf{C}$ be two holomorphic functions and assume that $U$ is connected. If the set $\{z \in U \mid f(z) = g(z)\}$ is not discrete in $U$, then $f = g$. In particular, a holomorphic function on a disc $D(z_0,r) \subset U$ admits at most one analytic continuation to all of $U$.


Next, we will define meromorphic functions as functions which are holomorphic on an open set $U$ except for at the poles. At a pole $z_0$, a meromorphic function has the following behavior: there exists an integer $m$, called the order of the pole, such that the function $(z - z_0)^m f(z)$ has a holomorphic continuation in a neighborhood of $z_0$ and is not equal to zero at $z_0$. This is the same as saying that $f(z)$ can be written, in a neighborhood of $z_0$, as
\[ f(z) = \frac{a_m}{(z - z_0)^m} + \frac{a_{m-1}}{(z - z_0)^{m-1}} + \cdots + \frac{a_1}{z - z_0} + (\text{a holomorphic function at } z_0). \]
The coefficient $a_1$ is called the residue of $f$ at $z_0$ and is denoted by $\mathrm{Res}(f; z_0)$. Its importance comes from its usefulness in calculating integrals.


We define the integral along a path as follows: for $\gamma : [a,b] \to \mathbf{C}$ of class $\mathcal{C}^1$, we set
\[ \int_\gamma f(z)\,dz = \int_a^b f(\gamma(t))\gamma'(t)\,dt. \]
The variable change formula shows that the value of the integral does not depend on the parametrization of the path but does, however, depend on the direction in which you integrate along it. For convenience sake, we will call a simple contour a path $\gamma : [a,b] \to \mathbf{C}$ such that $\gamma(a) = \gamma(b)$, but $\gamma$ is injective on $[a,b[$ and travels in the counterclockwise direction. A theorem due to Camille Jordan shows that any such contour partitions the plane into two connected parts, the interior and the exterior.


2.5. Theorem. (Residue theorem) Let $f$ be a meromorphic function on $U$. Let $\gamma$ be a simple contour which does not contain any poles of $f$ and $S$ the set of poles of $f$ in the interior of $\gamma$. Then,
\[ \int_\gamma f(z)\,dz = 2\pi i\sum_{a \in S} \mathrm{Res}(f; a). \]
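As an illustration of the residue theorem, one can approximate the integral along a circle by a Riemann sum, using the definition $\int_\gamma f(z)\,dz = \int_a^b f(\gamma(t))\gamma'(t)\,dt$ with $\gamma(t) = c + Re^{it}$. The sketch below is a numerical experiment only (the function name and the number of steps are our own choices). For $f(z) = 1/(z(z-2))$ and the unit circle, only the pole at 0 lies inside, with residue $-1/2$, so the integral should be close to $-\pi i$.

```python
import cmath
import math


def contour_integral(f, center=0.0, radius=1.0, steps=2000):
    """Riemann-sum approximation of the integral of f along the circle
    gamma(t) = center + radius * exp(i t), 0 <= t < 2 pi (counterclockwise)."""
    total = 0j
    dt = 2 * math.pi / steps
    for k in range(steps):
        t = k * dt
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * dt  # gamma'(t) dt
        total += f(z) * dz
    return total


if __name__ == "__main__":
    print(contour_integral(lambda z: 1 / z))              # close to 2*pi*i
    print(contour_integral(lambda z: 1 / (z * (z - 2))))  # close to -pi*i
```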


If $U$ is simply connected, i.e., “without holes”, then if $f$ is holomorphic on $U$ and $\gamma_1, \gamma_2$ are two paths in $U$, both of which join $a$ and $b$, then $\int_{\gamma_1} f(z)\,dz = \int_{\gamma_2} f(z)\,dz$. Thus we can define an antiderivative of a holomorphic function $f(z)$ on such an open set by the formula $F(b) = \int_\gamma f(z)\,dz$, where $\gamma$ is a path in $U$ which joins $a$ and $b$.


2.6. Proposition. Let $f_n(z)$ be a sequence of holomorphic functions from $U$ to $\mathbf{C}$. If the sequence converges uniformly on all compact sets in $U$ to a function $f$, then $f$ is holomorphic, and the $k$th derivatives $f_n^{(k)}$ converge uniformly on every compact set in $U$ to the function $f^{(k)}$.

We will expand on this point with the example of series of functions. Let $(u_n(z))$ be a sequence of holomorphic functions such that the series $S(z) := \sum_{n=0}^{\infty} u_n(z)$ converges; suppose moreover that it converges uniformly on every compact set, in other words,
\[ \sup_{z \in K}\left|\sum_{n=M}^{N} u_n(z)\right| \to 0 \]
when $M$ and $N$ tend to infinity, and $K \subset U$ is any compact subset. Then the function $S(z)$ is holomorphic, and
\[ S^{(k)}(z) = \sum_{n=0}^{\infty} u_n^{(k)}(z). \]


2.7. Example. (The complex logarithm). The function $\exp(z) = e^z = \sum_{n=0}^{\infty} z^n/n!$ is holomorphic on $U = \mathbf{C}$, and the series converges uniformly on every disk centered at 0 with radius $R$. We define
\[ F(z) = \sum_{n=1}^{\infty} (-1)^{n-1}\frac{(z-1)^n}{n} = -\sum_{n=1}^{\infty} \frac{(1-z)^n}{n}. \]
The series converges normally at every point in the open disk $D(1,1) = \{z \in \mathbf{C} \mid |1 - z| < 1\}$, and the convergence is uniform (and also normal) on every closed disk with center 1 and radius $r < 1$. Therefore, $F(z)$ is holomorphic on $D(1,1)$. If $z$ is a real number in the interval $]0,2[$, we can see that $F(z) = \log z$ (ordinary logarithm), and in particular,
\[ \exp(F(z)) = z. \]
The previous formula indicates that the two functions, the identity and $\exp \circ F$, which are analytic on the disk $D(1,1)$, coincide on the segment $]0,2[$ and hence on the whole disk. Thus $F$ defines a complex logarithm on the disk $|z - 1| < 1$.
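The identity $\exp(F(z)) = z$ on the disk $D(1,1)$ can be checked numerically at a non-real point. In the sketch below (the function name and the truncation at 200 terms are our own choices), we sum the series $-\sum_{n \ge 1} (1-z)^n/n$ and compare with the input.

```python
import cmath


def log_series(z, terms=200):
    """Partial sum of F(z) = -sum_{n>=1} (1 - z)^n / n, for |1 - z| < 1."""
    w = 1 - z
    return -sum(w ** n / n for n in range(1, terms + 1))


if __name__ == "__main__":
    z = 1.3 + 0.4j            # |z - 1| = 0.5 < 1
    F = log_series(z)
    print(F, cmath.exp(F))    # exp(F(z)) should be close to z
```

Close to the boundary of the disk the series converges slowly, so more terms would be needed there.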


2.8. Definition. Let f(z) be a holomorphic function on U. We say 
that F(z) is a branch of the logarithm of f on U (and we write, with a 
slight abuse of notation, F(z) = log f(z)) if F(z) is holomorphic and if 
exp (F(z)) = f(z). 


2.9. Remark. If $F(z)$ is a branch of the logarithm of $f$, then $f$ is never 0 on $U$, we have $|\exp(F(z))| = \exp(\mathrm{Re}\,F(z)) = |f(z)|$, and hence
\[ \mathrm{Re}\log f(z) = \log|f(z)|. \]
Likewise, $f'(z)/f(z) = F'(z)\exp(F(z))/\exp(F(z)) = F'(z)$, and also
\[ \frac{d}{dz}\log f(z) = \frac{f'(z)}{f(z)}. \]
Finally, if $F_1$ and $F_2$ are two logarithms, then $F_2(z) = F_1(z) + 2k\pi i$ on any connected set $U$.

This remark suggests that we should construct the logarithm of $f(z)$ as an antiderivative of $f'(z)/f(z)$, with the condition that $f$ is not zero. We have seen that this is possible if $U$ is simply connected.


2.10. Proposition. Let $U$ be a simply connected open subset of the complex plane and $f(s)$ a holomorphic function without any zeros in $U$. Then there exists a holomorphic branch $F(s) = \log f(s)$ on $U$. Two such branches differ by an integer multiple of $2\pi i$.


We will finish this summary by explaining the notion of an infinite product. The first idea consists of saying that a product is convergent if $\lim_{N \to \infty} \prod_{n=0}^{N} p_n$ exists. This could be confusing because it is not true that such a product is zero if and only if one of the factors is zero. For example, it can be easily checked that
\[ \lim_{N \to \infty} \prod_{n=2}^{N}\left(1 - \frac{1}{n}\right) = 0. \]
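A quick computation makes this example concrete. In the sketch below (the function name is ours, and we assume the product $\prod_{n=2}^{N}(1 - 1/n)$), exact rational arithmetic shows that the partial products telescope to $1/N$, so they tend to 0 although no factor vanishes.

```python
from fractions import Fraction


def partial_product(N):
    """Exact value of the product of (1 - 1/n) for n = 2, ..., N."""
    result = Fraction(1)
    for n in range(2, N + 1):
        result *= 1 - Fraction(1, n)
    return result


if __name__ == "__main__":
    print([partial_product(N) for N in (2, 3, 10, 100)])
```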


We can overcome this inconvenience by defining infinite products a little differently. Observe first that a necessary condition for the convergence of a non-zero product $\prod_n p_n$ is to have $\lim_n p_n = 1$; it therefore does not hurt to assume that $p_n = 1 + u_n$ where $u_n$ tends to zero. In particular, $\log(1 + u_n) = \sum_{k=1}^{\infty} (-1)^{k-1}u_n^k/k$ is well-defined when $|u_n| < 1$ and hence when $n \ge n_0$, which justifies the following definition.


2.11. Definition. A product $\prod_{n=0}^{\infty}(1 + u_n)$ is convergent (resp. absolutely convergent) if there exists $n_0$ such that $|u_n| < 1$ for all $n \ge n_0$ and the series $\sum_{n=n_0}^{\infty} \log(1 + u_n)$ is convergent (resp. absolutely convergent). A product of functions $\prod_{n=0}^{\infty}(1 + u_n(z))$ is uniformly convergent on $K$ if there exists $n_0$ such that $|u_n(z)| < 1$ for all $n \ge n_0$ and $z \in K$ and the series $\sum_{n=n_0}^{\infty} \log(1 + u_n(z))$ is uniformly convergent (on $K$).

2.12. Lemma. A product $P := \prod_{n=0}^{\infty}(1 + u_n)$ is absolutely convergent if and only if the series $\sum_{n=0}^{\infty} |u_n|$ is convergent. If $P$ is convergent, it is zero if and only if one of the factors $1 + u_n$ is zero.


2.13. Proposition. Let $(u_n(z))$ be a sequence of holomorphic functions on an open set $U$ such that the series $\sum_n \log(1 + u_n(z))$ converges uniformly on every compact subset of $U$.

i) Then the function defined by the infinite product
\[ P(z) := \prod_{n=0}^{\infty}(1 + u_n(z)) \]
is holomorphic on $U$.

ii) For every $z_0 \in U$, only a finite number of $p_n(z) := 1 + u_n(z)$ are zero at $z_0$, and hence
\[ \mathrm{ord}_{z_0} P(z) = \sum_{n=0}^{\infty} \mathrm{ord}_{z_0}\,p_n(z). \]


3. Dirichlet Series and the Function $\zeta(s)$

We call a Dirichlet series a series of the form $F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}$. We will now state its first important property.

3.1. Proposition. Let $F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}$ be a Dirichlet series that we will assume to be convergent at $s_0$. Then it converges uniformly on the sets
\[ E_{C,s_0} = \{s \in \mathbf{C} \mid \mathrm{Re}(s - s_0) \ge 0,\ |s - s_0| \le C\,\mathrm{Re}(s - s_0)\}. \]

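Proof. For $M \ge 1$, set $A_M(x) := \sum_{M < n \le x} a_n n^{-s_0}$. By the hypothesis, we then have $|A_M(x)| \le \epsilon(M)$ where $\epsilon(M)$ tends to zero as $M$ tends to infinity. Abel’s formula gives
\begin{align*}
\sum_{M < n \le N} a_n n^{-s} &= \sum_{M < n \le N} a_n n^{-s_0} n^{-(s - s_0)} \\
&= A_M(N)N^{-(s - s_0)} + (s - s_0)\int_M^N A_M(t)t^{-(s - s_0) - 1}\,dt.
\end{align*}
We can find an upper bound for the integral as follows:
\[ \left|\int_M^N A_M(t)t^{-(s - s_0) - 1}\,dt\right| \le \epsilon(M)\int_M^N t^{-(\sigma - \sigma_0) - 1}\,dt = \epsilon(M)\,\frac{M^{-(\sigma - \sigma_0)} - N^{-(\sigma - \sigma_0)}}{\sigma - \sigma_0}. \]
By restricting to an angular sector $E_{C,s_0}$, bounded by $\sigma - \sigma_0 = \mathrm{Re}(s) - \mathrm{Re}(s_0) \ge 0$ and $|s - s_0| \le C(\sigma - \sigma_0)$, we obtain
\[ \left|\sum_{M < n \le N} a_n n^{-s}\right| \le \epsilon(M)(1 + C), \]
which suffices to show the uniform convergence on this sector (cf. the figure below).

[Figure: the domain $|s - s_0| \le \dfrac{\sigma - \sigma_0}{\cos\theta}$.]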


The following corollary is a result of the general theorems recalled in the previous section (in particular, Proposition 4-2.6).

3.2. Corollary. Every Dirichlet series $F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}$ has an abscissa of convergence, say $\sigma_0$, such that the series converges for $\mathrm{Re}(s) > \sigma_0$ and diverges for $\mathrm{Re}(s) < \sigma_0$. Furthermore, the function $F$ defined by the series is holomorphic in the half-plane of convergence $\mathrm{Re}(s) > \sigma_0$, and its derivatives are given by $F^{(k)}(s) = \sum_{n=1}^{\infty} a_n(-\log n)^k n^{-s}$.

Proof. It suffices to let $\sigma_0 = \inf\{\sigma \in \mathbf{R} \mid \text{the series converges at } \sigma\}$, then to observe that every compact set in the (open) half-plane of convergence is contained in a sector, as above, where the convergence is uniform.


3.3. Remarks. 1) If a Dirichlet series converges at $s_0 = \sigma_0 + it_0$ to the number $S$, then the proof of Proposition 4-3.1 (above) shows that $S = \lim_{\epsilon \to 0^+} F(\sigma_0 + \epsilon + it_0)$.

2) Set $A(t) := \sum_{n \le t} a_n$. The previous proof allows us to establish the formula
\[ \sum_{n=1}^{\infty} a_n n^{-s} = s\int_1^\infty \Big(\sum_{n \le t} a_n\Big)t^{-s-1}\,dt = s\int_1^\infty A(t)t^{-s-1}\,dt. \tag{4.14} \]
In particular, if $A(t) = \sum_{n \le t} a_n$ is bounded, then the series converges whenever $\mathrm{Re}(s) > 0$.

The most famous Dirichlet series is the Riemann zeta function, defined by the series $\sum_{n=1}^{\infty} n^{-s}$. It is well-known, at least for real values and the general case follows from the real case, that the abscissa of convergence is $+1$.

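3.4. Theorem. (Euler Product) If $\mathrm{Re}(s) > 1$, then the following formula holds
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}. \tag{4.15} \]

Proof. Notice that $\sum_p |p^{-s}| = \sum_p p^{-\sigma} \le \sum_n n^{-\sigma}$, hence the product is absolutely convergent. If $\mathrm{Re}(s) > 0$, the convergence of geometric series allows us to write $(1 - p^{-s})^{-1} = \sum_{m=0}^{\infty} p^{-ms}$. By taking the product over the prime numbers $p_1, \dots, p_r$ which are $\le T$, we obtain
\begin{align*}
\prod_{p \le T}(1 - p^{-s})^{-1} &= \prod_{p \le T}\left(\sum_{m=0}^{\infty} p^{-ms}\right) \\
&= \sum_{m_1, \dots, m_r \ge 0} (p_1^{m_1} \cdots p_r^{m_r})^{-s} = \sum_{n \in \mathcal{N}(T)} n^{-s},
\end{align*}
where we denote by $\mathcal{N}(T)$ the set of integers all of whose prime factors are $\le T$. Thus whenever $\mathrm{Re}(s) > 1$, we have
\[ \left|\sum_{n=1}^{\infty} n^{-s} - \prod_{p \le T}(1 - p^{-s})^{-1}\right| = \left|\sum_{n \notin \mathcal{N}(T)} n^{-s}\right| \le \sum_{n > T} |n^{-s}| = \sum_{n > T} n^{-\sigma}. \]
The last sum is the tail-end of a real convergent series (when $\sigma := \mathrm{Re}(s) > 1$) and thus tends to zero, which proves both the convergence of the product and Euler’s formula.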
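The proof gives an explicit error bound which is pleasant to test. The sketch below (function names are ours) takes $s = 2$, where $\zeta(2) = \pi^2/6$ is a classical value not proved in this section, and compares it with the truncated product over primes $p \le T$; the difference should be positive and smaller than $\sum_{n > T} n^{-2} < 1/T$.

```python
import math


def primes_up_to(x):
    """List of primes <= x (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [p for p in range(2, x + 1) if sieve[p]]


def euler_product(s, T):
    """Truncated Euler product: product over primes p <= T of (1 - p^(-s))^(-1)."""
    value = 1.0
    for p in primes_up_to(T):
        value /= 1 - p ** (-s)
    return value


if __name__ == "__main__":
    for T in (10, 100, 1000):
        print(T, euler_product(2, T), math.pi**2 / 6)
```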


3.5. Corollary. The function $\zeta(s)$ does not have any zeros in the open half-plane $\mathrm{Re}(s) > 1$. A holomorphic branch of $\log\zeta(s)$ for $\mathrm{Re}(s) > 1$ can be constructed by setting
\[ \log\zeta(s) = \sum_p \sum_{m \ge 1} \frac{p^{-ms}}{m}. \tag{4.16} \]
Furthermore, if we define the von Mangoldt function by
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^m \text{ with } m \ge 1, \\ 0 & \text{if not,} \end{cases} \]
then
\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_p \sum_{m \ge 1} \log(p)\,p^{-ms} = \sum_{n=1}^{\infty} \Lambda(n)n^{-s}. \tag{4.17} \]

Proof. We know $1 - p^{-s} \neq 0$ and that the product is convergent. The first assertion is therefore obvious. Formula (4.16) can be deduced from Euler’s formula by taking the series expansion of the logarithm (valid for $|x| < 1$) and summing:
\[ \log\big((1 - x)^{-1}\big) = \sum_{m=1}^{\infty} \frac{x^m}{m}. \]
Formula (4.17) is then obtained by differentiating (4.16).
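The von Mangoldt function and the sum $\psi(x) = \sum_{n \le x} \Lambda(n)$ are simple to compute. The sketch below (function names and the truncation bound are our own choices) also evaluates a partial sum of the Dirichlet series in (4.17) at $s = 2$; the printed value is only a numerical approximation of $-\zeta'(2)/\zeta(2)$, with a tail of size about $1/N$.

```python
import math


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power p^m (m >= 1) of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime


def psi(x):
    """Chebyshev's function psi(x) = sum of Lambda(n) for n <= x."""
    return sum(von_mangoldt(n) for n in range(1, x + 1))


def mangoldt_series(s, N):
    """Partial sum of sum_{n <= N} Lambda(n) / n^s."""
    return sum(von_mangoldt(n) / n ** s for n in range(2, N + 1))


if __name__ == "__main__":
    print(psi(1000) / 1000)            # the prime number theorem predicts a limit of 1
    print(mangoldt_series(2, 50_000))  # approximately -zeta'(2)/zeta(2)
```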


Interlude (I). These formulas can be generalized by replacing $\mathbf{Z}$ and $\mathbf{Q}$ by $\mathcal{O}_K$ and $K$, and the uniqueness of the decomposition into prime factors by the uniqueness of the decomposition into prime ideals (Theorem 3-4.18). We denote by $\mathcal{I}_K$ the set of non-zero ideals and $\mathcal{P}_K$ the set of non-zero (maximal) prime ideals in $\mathcal{O}_K$. Now we can introduce the Dedekind zeta function and prove that
\[ \zeta_K(s) := \sum_{I \in \mathcal{I}_K} \mathrm{N}(I)^{-s} = \prod_{\mathfrak{p} \in \mathcal{P}_K} \big(1 - \mathrm{N}(\mathfrak{p})^{-s}\big)^{-1} \quad\text{for } \mathrm{Re}(s) > 1. \]

3.6. Proposition. The function $\zeta(s)$ can be analytically continued to a meromorphic function on the half-plane $\mathrm{Re}(s) > 0$, with a unique pole at $s = 1$ with residue equal to $+1$.

Proof. The statement says that $\zeta(s) - 1/(s-1)$, originally defined for $\mathrm{Re}(s) > 1$, has a holomorphic continuation to the half-plane $\mathrm{Re}(s) > 0$. To prove this, we can write $\zeta(s)$, using the expression $\lfloor t \rfloor = \sum_{n \le t} 1$, as
\begin{align*}
\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} &= s\int_1^\infty \lfloor t \rfloor t^{-s-1}\,dt \\
&= s\int_1^\infty t^{-s}\,dt + s\int_1^\infty (\lfloor t \rfloor - t)t^{-s-1}\,dt \\
&= \frac{1}{s-1} + 1 + s\int_1^\infty (\lfloor t \rfloor - t)t^{-s-1}\,dt.
\end{align*}
We know that $|\lfloor t \rfloor - t| \le 1$. The last integral is hence convergent and defines a holomorphic function for $\mathrm{Re}(s) > 0$.


3.7. Remark. Actually, the function $\zeta(s) - 1/(s-1)$ can be extended to the whole complex plane, and moreover, $\zeta(s)$ satisfies a functional equation (see Theorem 4-5.6 further down).


4. Characters and Dirichlet’s Theorem 


4.1. Definition. If $G$ is a finite abelian group, a homomorphism from $G$ to $\mathbf{C}^*$ is called a character. The set of characters of $G$ forms a group denoted by $\hat{G}$.

4.2. Proposition. The group $\hat{G}$ is isomorphic to the group $G$ (but not canonically).


Proof. If G = Z/nZ, a character satisfies y(1) € fin (where jin denotes as 
usual the group of nth roots of unity), and the map x + x(1) provides 
an isomorphism between G and Lin. The latter is isomorphic to Z/nZ 
and hence to G. We will now show that eae yes — Gi x Ge to see 
why this is true, a character y of Gi x G2 can be written as y(g1, 92) = 
x(g1, €2)x(e1, g2), and, by setting y1 = x(-, e2) and x2 = y(e1,-), we obtain 
an isomorphism x +> (x1, X2) from eyes to G1 x Gy. The general case 
is now easy: we have G = Z/niZ x --: x Z/n,Z, hence 


GX (Z/mZ xx Z/npL) 
~O/mZx x Z/n,L%Z/ryZ xo xX Z/n LG. 
4.3. Lemma. If $x$ is an element of order $r$ in $G$, then for every $\xi$, an $r$th root of unity, there exist $|G|/r$ characters $\chi$ such that $\chi(x) = \xi$. In particular, in the ring of polynomials $\mathbb{C}[T]$, the following formula holds:

$$\prod_{\chi \in \hat{G}} (1 - \chi(x)T) = (1 - T^r)^{|G|/r}.$$


Proof. We can immediately see that $\chi(x) \in \mu_r$ since $\chi(x)^r = \chi(x^r) = \chi(e_G) = 1$. Consider the map $\chi \mapsto \chi(x)$ from $\hat{G}$ to $\mu_r$, which is a homomorphism whose kernel we will now identify. Let $H$ be the subgroup generated by $x$ (so that $H \cong \mathbb{Z}/r\mathbb{Z}$). The kernel of the previous homomorphism consists of the characters which satisfy $\chi(x) = 1$, that is, of the characters which are trivial on $H$. The latter are in bijection with the characters of $G/H$, and their cardinality is therefore $\mathrm{card}(G/H) = |G|/r$. We see that the image has cardinality $r$, thus the homomorphism is surjective, which completes the proof of the first part of the lemma. For the last formula, it suffices to notice that

$$\prod_{\chi \in \hat{G}} (1 - \chi(x)T) = \left(\prod_{\xi \in \mu_r} (1 - \xi T)\right)^{|G|/r} = (1 - T^r)^{|G|/r}.$$

We will essentially use this lemma in the form of the following corollary. 


4.4. Corollary. Let $p$ be a prime number which does not divide $n$ and $r$ the order of $p$ modulo $n$. Consider the set of Dirichlet characters modulo $n$ (see below). Then the following formula holds:

$$\prod_{\chi \bmod n} \left(1 - \frac{\chi(p)}{p^s}\right) = \left(1 - \frac{1}{p^{rs}}\right)^{\phi(n)/r}. \qquad (4.18)$$

4.5. Proposition. If $G$ is a finite commutative group, we have the following relations:

$$\forall g \in G \setminus \{e\},\ \sum_{\chi \in \hat{G}} \chi(g) = 0 \quad \text{and} \quad \forall \chi \in \hat{G} \setminus \{1\},\ \sum_{g \in G} \chi(g) = 0.$$


Proof. If $g = e$, we clearly have $\sum_{\chi \in \hat{G}} \chi(g) = |G|$. If $g \ne e$, there exists, by the previous lemma, a character $\chi_1$ such that $\chi_1(g) \ne 1$, and hence

$$\sum_{\chi \in \hat{G}} \chi(g) = \sum_{\chi \in \hat{G}} (\chi_1 \chi)(g) = \chi_1(g) \sum_{\chi \in \hat{G}} \chi(g),$$

which gives us the first equality. We will handle the other sum similarly by observing that if $\chi = 1$, then $\sum_{g \in G} \chi(g) = |G|$, and if $\chi \ne 1$, then there exists $g_1$ such that $\chi(g_1) \ne 1$, hence

$$\sum_{g \in G} \chi(g) = \sum_{g \in G} \chi(g g_1) = \chi(g_1) \sum_{g \in G} \chi(g),$$

which gives us the second formula.


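4.6. Lemma. Let $a \in G$. Then,

$$\sum_{\chi \in \hat{G}} \bar{\chi}(a)\chi(x) = \begin{cases} |G| & \text{if } x = a, \\ 0 & \text{if not.} \end{cases}$$


Proof. This statement follows from the previous formulas since $\chi(a)$ is a root of unity, so $\bar{\chi}(a) = \chi(a)^{-1} = \chi(a^{-1})$, and therefore $\sum_{\chi \in \hat{G}} \bar{\chi}(a)\chi(x) = \sum_{\chi \in \hat{G}} \chi(a^{-1}x)$ equals $|G|$ if $x = a$ and $0$ if not.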
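The orthogonality relation of Lemma 4.6 can be checked numerically in the simplest case. The following is a minimal sketch, assuming $G = \mathbb{Z}/n\mathbb{Z}$ written additively, whose characters are $\chi_k(x) = \exp(2\pi i k x / n)$; the function names are illustrative and not taken from the text.

```python
import cmath


def characters_of_cyclic_group(n):
    """Return the n characters of Z/nZ, each as the list [chi(0), ..., chi(n-1)]."""
    return [
        [cmath.exp(2j * cmath.pi * k * x / n) for x in range(n)]
        for k in range(n)
    ]


def orthogonality_sum(n, a, x):
    """Compute the sum over all characters chi of conj(chi(a)) * chi(x)."""
    characters = characters_of_cyclic_group(n)
    return sum(chi[a % n].conjugate() * chi[x % n] for chi in characters)


if __name__ == "__main__":
    n = 12
    # The sum equals |G| = n when x = a and vanishes otherwise.
    print(abs(orthogonality_sum(n, 5, 5) - n) < 1e-9)
    print(abs(orthogonality_sum(n, 5, 7)) < 1e-9)
```

By Proposition 4.2 every finite abelian group is a product of cyclic groups, so this case is the building block of the general one.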


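4.7. Definition. Let $\chi : (\mathbb{Z}/m\mathbb{Z})^* \to \mathbb{C}^*$ be a character of $(\mathbb{Z}/m\mathbb{Z})^*$. The Dirichlet character modulo $m$ (also denoted by $\chi$) is the map from $\mathbb{Z}$ to $\mathbb{C}$ defined by

$$\chi(n) = \begin{cases} \chi(n \bmod m) & \text{if } \gcd(m, n) = 1, \\ 0 & \text{if } \gcd(m, n) > 1. \end{cases}$$


Remark. We have the multiplicativity property: $\forall n, n' \in \mathbb{Z}$, $\chi(nn') = \chi(n)\chi(n')$; in other words, the function $\chi$ is completely multiplicative.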


We will use these characters in the following way: we have the equality (at least formally)

$$\sum_{p \equiv a \bmod m} f(p) = \frac{1}{\phi(m)} \sum_{p} \sum_{\chi} \bar{\chi}(a)\chi(p) f(p) = \frac{1}{\phi(m)} \sum_{p} f(p) + \frac{1}{\phi(m)} \sum_{\chi \ne 1} \bar{\chi}(a) \left(\sum_{p} \chi(p) f(p)\right).$$

We have already looked at sums like the first term $\sum_p f(p)$; to be able to deal with sums of the type $\sum_p \chi(p) f(p)$, we introduce the following series.


4.8. Definition. Let $\chi$ be a Dirichlet character modulo $m$. We define the Dirichlet "L"-series by the following series:

$$L(\chi, s) := \sum_{n=1}^{\infty} \chi(n) n^{-s}.$$
Remark. If $\chi_0$ is the unitary character or principal character modulo $m$, we have $\chi_0(n) = 1$ or $0$ depending on whether $n$ is relatively prime to $m$ or not. We can easily deduce from this that $L(\chi_0, s)$ is almost equal to the function $\zeta(s)$; to be more precise,

$$L(\chi_0, s) = \sum_{\gcd(n,m)=1} n^{-s} = \prod_{p \nmid m} (1 - p^{-s})^{-1} = \prod_{p \mid m} (1 - p^{-s})\, \zeta(s).$$
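As a numerical illustration of the remark above, the following sketch compares a partial sum of $L(\chi_0, s)$ with $\prod_{p \mid m}(1 - p^{-s})\zeta(s)$. The choice $m = 6$, $s = 2$ and the use of the classical value $\zeta(2) = \pi^2/6$ are assumptions made for the example, as are the function names.

```python
import math


def principal_l_partial_sum(m, s, terms):
    """Partial sum of L(chi_0, s): sum of n^(-s) over n <= terms coprime to m."""
    return sum(n ** (-s) for n in range(1, terms + 1) if math.gcd(n, m) == 1)


def prime_divisors(m):
    """Return the distinct prime divisors of m by trial division."""
    divisors = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            divisors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        divisors.append(m)
    return divisors


def corrected_zeta(m, s, zeta_value):
    """Return prod_{p | m} (1 - p^(-s)) * zeta(s), given the value of zeta(s)."""
    product = 1.0
    for p in prime_divisors(m):
        product *= 1.0 - p ** (-s)
    return product * zeta_value


if __name__ == "__main__":
    left = principal_l_partial_sum(6, 2, 200_000)
    right = corrected_zeta(6, 2, math.pi ** 2 / 6)
    print(left, right)
```

The two printed values agree up to the tail of the series, which is smaller than the reciprocal of the number of terms.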


4.9. Proposition. The abscissa of convergence of the series $L(\chi, s)$ is $\sigma = 0$, except when $\chi$ is the unitary character, in which case $\sigma = 1$.


Proof. We have seen from the previous remark that the series $L(\chi_0, s)$, where $\chi_0$ is the unitary character, has the same abscissa of convergence as the series which defines the zeta function, in other words $1$. The terms in the series do not tend to $0$ if $\mathrm{Re}(s) < 0$, hence the series cannot converge. If $\chi$ is a character modulo $m$ which is not the unitary character, then $\sum_{n=1}^{m} \chi(n) = 0$, and hence

$$\left|\sum_{n \le x} \chi(n)\right| = \left|\sum_{m\lfloor x/m \rfloor < n \le x} \chi(n)\right| \le m.$$

Furthermore, we have seen (cf. Remark 4-3.3) that a Dirichlet series whose coefficients have bounded partial sums converges when $\mathrm{Re}(s) > 0$; the series $L(\chi, s)$ therefore converges when $\mathrm{Re}(s) > 0$.


4.10. Remark. The abscissa of absolute convergence is $1$ and is strictly larger than the abscissa of convergence, which is $0$ in this case. For any Dirichlet series, if we denote by $\sigma_c$ its abscissa of convergence and $\sigma_a$ its abscissa of absolute convergence, we can show that we always have the following inequality: $\sigma_c \le \sigma_a \le \sigma_c + 1$.


4.11. Theorem. The generalized Euler formula holds:

$$L(\chi, s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \left(1 - \frac{\chi(p)}{p^s}\right)^{-1} \quad \text{when } \mathrm{Re}(s) > 1. \qquad (4.19)$$


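Proof. When $\mathrm{Re}(s) > 0$, we can consider the following convergent geometric series: $(1 - \chi(p)p^{-s})^{-1} = \sum_{m=0}^{\infty} \chi(p)^m p^{-ms}$, and by taking the product over the prime numbers $p_1, \dots, p_r$ which are $\le T$, we therefore obtain

$$\prod_{p \le T} \left(1 - \chi(p)p^{-s}\right)^{-1} = \prod_{p \le T} \left(\sum_{m=0}^{\infty} \chi(p)^m p^{-ms}\right)$$
$$= \sum_{m_1, \dots, m_r \ge 0} \chi(p_1)^{m_1} \cdots \chi(p_r)^{m_r} \left(p_1^{m_1} \cdots p_r^{m_r}\right)^{-s}$$
$$= \sum_{n \in \mathcal{N}(T)} \chi(n) n^{-s},$$

where $\mathcal{N}(T)$ denotes the integers all of whose prime factors are $\le T$. Thus whenever $\mathrm{Re}(s) > 1$, we have

$$\left|\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} - \prod_{p \le T} \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}\right| = \left|\sum_{n \notin \mathcal{N}(T)} \frac{\chi(n)}{n^s}\right| \le \sum_{n > T} \left|\frac{\chi(n)}{n^s}\right| \le \sum_{n > T} \frac{1}{n^{\sigma}}.$$

The last sum is the tail-end of a real convergent series (when $\sigma := \mathrm{Re}(s) > 1$). It therefore tends to zero as $T$ tends to infinity, which proves both the convergence of the product and the generalized Euler formula.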
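The bound in the proof can be observed numerically. The sketch below, which assumes the non-principal character modulo 4 and the point $s = 2$ as an example, compares a partial sum of the series with the Euler product truncated at primes $\le T$; the function names are illustrative.

```python
def chi4(n):
    """Non-principal Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            is_prime[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def l_series_partial_sum(s, terms):
    """Partial sum of L(chi4, s) = sum chi4(n) / n^s."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


def truncated_euler_product(s, bound):
    """Product over primes p <= bound of (1 - chi4(p) / p^s)^(-1)."""
    product = 1.0
    for p in primes_up_to(bound):
        product /= 1.0 - chi4(p) / p ** s
    return product


if __name__ == "__main__":
    series = l_series_partial_sum(2, 100_000)
    product = truncated_euler_product(2, 10_000)
    # The proof bounds the difference by the tail sum over n > T of n^(-2).
    print(series, product, abs(series - product))
```

For this character $L(\chi, 2)$ is Catalan's constant, approximately $0.9159656$, which gives an independent check of the partial sum.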


4.12. Corollary. When $\mathrm{Re}(s) > 1$, we have $L(\chi, s) \ne 0$.


Proof. This is obvious since the Euler product is convergent and $1 - \chi(p)p^{-s} \ne 0$.


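4.13. Corollary. We also have the following formulas.

$$\log L(\chi, s) = \sum_{p} \sum_{m \ge 1} \frac{\chi(p)^m}{m}\, p^{-ms}, \qquad (4.20)$$

$$-\frac{L'(\chi, s)}{L(\chi, s)} = \sum_{p,\, m \ge 1} \chi(p)^m \log(p)\, p^{-ms} = \sum_{n \ge 1} \chi(n)\Lambda(n)\, n^{-s}. \qquad (4.21)$$


Proof. We can use an argument similar to the one we used for the function $\zeta(s)$.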


Interlude (II). Let $K = \mathbb{Q}(\sqrt{d})$ where $d$ is square-free and $\ne 1$. The decomposition law in $\mathcal{O}_K$ of primes of $\mathbb{Z}$ allows us to describe the contribution of $p$ to the Dedekind function $\zeta_K(s)$:

$$\begin{cases} (1 - p^{-s})^{-2} = (1 - p^{-s})^{-1}\left(1 - \left(\frac{d}{p}\right)p^{-s}\right)^{-1} & \text{if } p \text{ is split in } K, \\ (1 - p^{-2s})^{-1} = (1 - p^{-s})^{-1}\left(1 - \left(\frac{d}{p}\right)p^{-s}\right)^{-1} & \text{if } p \text{ is inert in } K, \\ (1 - p^{-s})^{-1} = (1 - p^{-s})^{-1}\left(1 - \left(\frac{d}{p}\right)p^{-s}\right)^{-1} & \text{if } p \text{ is ramified in } K, \end{cases}$$

for odd $p$ (there is a similar statement for $p = 2$). We therefore have that

$$\zeta_K(s) = \zeta(s) L(\chi_d, s),$$

where $\chi_d$ is the character defined by $\chi_d(p) = \left(\frac{d}{p}\right)$ for odd $p$ and $\chi_d(2) = 1$ (resp. $\chi_d(2) = -1$, $\chi_d(2) = 0$) if $d \equiv 1 \bmod 8$ (resp. $d \equiv 5 \bmod 8$, $d \equiv 2$ or $3 \bmod 4$). It can be shown as an exercise that if $D := |d|$ for $d \equiv 1 \bmod 4$ (resp. $D := 4|d|$ for $d \equiv 2$ or $3 \bmod 4$), then $\chi_d$ is a character modulo $D$. We will prove below that $L(\chi_d, 1) \ne 0$, and therefore the function $\zeta_K(s)$ has, just like $\zeta(s)$, a pole of order 1 at $s = 1$, at which the residue equals $L(\chi_d, 1)$. One of the nicest results in analysis, the class number formula for a quadratic field, is given by:

$$\mathrm{Res}(\zeta_K, 1) = L(\chi_d, 1) = \begin{cases} \dfrac{2\pi h_K}{w\sqrt{D}} & \text{if } K \text{ is imaginary,} \\[2ex] \dfrac{2 h_K \log \epsilon}{\sqrt{D}} & \text{if } K \text{ is real,} \end{cases}$$

where $h_K$ is the class number, $w$ the number of roots of unity (equal to 4 or 6 if $d = -1$ or $d = -3$, and equal to 2 for every other $d < 0$) and $\epsilon$ is the fundamental unit $> 1$ (the generator of $\mathcal{O}_K^*$ modulo $\pm 1$). This formula, together with the explicit computation of $L(\chi, 1)$ (see Exercise 4-6.6), is very useful, in particular for studying $h_K$.


For the proof of the theorem on arithmetic progressions, we will need the 
following key result. 


4.14. Theorem. Let $\chi$ be a Dirichlet character different from the unitary character. Then,

$$L(\chi, 1) \ne 0.$$


4.15. Remark. If we knew that the Euler product converged at $s = 1$, we would immediately have this result since $1 - \chi(p)p^{-1} \ne 0$. Luckily we can show, by using the fact that $L(\chi, 1)$ is non-zero, that the Euler product at 1 converges (see Exercise 4-6.7).


Before proving the theorem, we will see how to deduce Dirichlet's theorem (4-1.4) from it.

Proof. For $\mathrm{Re}(s) > 1$, we write the formula as

$$\sum_{p \equiv a \bmod m} p^{-s} = \frac{1}{\phi(m)} \sum_{p} p^{-s} + \frac{1}{\phi(m)} \sum_{\chi \ne 1} \bar{\chi}(a) \left(\sum_{p} \chi(p) p^{-s}\right).$$

The generalized Euler formula then yields

$$\sum_{p} \chi(p) p^{-s} = \log L(\chi, s) + \text{a function holomorphic for } \mathrm{Re}(s) > 1/2.$$

Indeed, the expression $\left|\sum_{p,\, m \ge 2} \frac{\chi(p)^m}{m} p^{-ms}\right|$ is bounded above by $\sum_{p,\, m \ge 2} p^{-m\sigma}/m$, which converges when $\sigma > 1/2$. Consequently, when $\chi$ is not the unitary character and knowing that $L(\chi, 1) \ne 0$, we can deduce that $\sum_p \chi(p) p^{-s} = O(1)$ in a neighborhood of $s = 1$. However, $\sum_p p^{-s} = -\log(s - 1) + O(1)$. We can therefore conclude that

$$\sum_{p \equiv a \bmod m} p^{-s} = -\frac{\log(s - 1)}{\phi(m)} + O(1),$$

which indeed proves the theorem on arithmetic progressions.


4.16. Remark. If $\mathcal{Q}$ designates a subset of the set $\mathcal{P}$ of prime numbers, we can define various notions of density. The preceding proof suggests that we should introduce the notion of analytic density:

$$\lim_{s \to 1^+} \frac{\sum_{p \in \mathcal{Q}} p^{-s}}{\sum_{p \in \mathcal{P}} p^{-s}}.$$

Thus we have just shown that the analytic density of prime numbers congruent to $a$ modulo $m$ is $1/\phi(m)$. We could also define the "natural" density as

$$d(\mathcal{Q}) := \lim_{x \to \infty} \frac{\mathrm{card}\{p \in \mathcal{Q} \mid p \le x\}}{\mathrm{card}\{p \in \mathcal{P} \mid p \le x\}}.$$

It can be shown, but we will not do it, that the natural density of prime numbers congruent to $a$ modulo $m$ is $1/\phi(m)$.
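The natural density statement can be observed empirically. The following sketch counts primes up to a bound in each residue class; the modulus 10 and the bound $10^5$ are arbitrary choices for the example, and the function names are illustrative.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            is_prime[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def prime_proportions_by_residue(modulus, limit):
    """Proportion of the primes p <= limit lying in each residue class mod modulus."""
    primes = primes_up_to(limit)
    counts = {}
    for p in primes:
        counts[p % modulus] = counts.get(p % modulus, 0) + 1
    return {a: c / len(primes) for a, c in sorted(counts.items())}


if __name__ == "__main__":
    # phi(10) = 4, so each class 1, 3, 7, 9 should hold about a quarter of the primes.
    for residue, proportion in prime_proportions_by_residue(10, 100_000).items():
        print(residue, round(proportion, 4))
```

The classes 2 and 5 each contain a single prime, so their proportions tend to zero, in agreement with the fact that only classes coprime to the modulus matter.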


To prove that $L(\chi, 1) \ne 0$, we use the following lemma about Dirichlet series with positive real coefficients.


4.17. Lemma. Let $(a_n)$ be a sequence of positive real numbers. Suppose that the series $F(s) = \sum_{n \ge 1} a_n n^{-s}$ converges for $\mathrm{Re}(s) > \sigma_0$ and that the function can be analytically continued in a neighborhood of $\sigma_0$. Then the abscissa of convergence of the series which defines $F(s)$ is strictly less than $\sigma_0$.


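Proof. Choose $r > 0$ and $\sigma < \sigma_0 < \sigma_1$ so that $\sigma \in D(\sigma_1, r)$, where this disk is contained in the domain of holomorphy of $F(s)$. The point $\sigma_1$ is in the half-plane of convergence, hence

$$F^{(k)}(\sigma_1) = \sum_{n=1}^{\infty} a_n (-\log n)^k n^{-\sigma_1}.$$

[Figure: the disk $D(\sigma_1, r)$ inside the domain where $F(s)$ is holomorphic.]

By writing the expansion of $F$ as a power series in the disk $D(\sigma_1, r)$ at the point $\sigma$, we obtain:

$$F(\sigma) = \sum_{k=0}^{\infty} \frac{F^{(k)}(\sigma_1)}{k!} (\sigma - \sigma_1)^k$$
$$= \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{n=1}^{\infty} a_n (\log n)^k n^{-\sigma_1} (\sigma_1 - \sigma)^k$$
$$= \sum_{n=1}^{\infty} a_n n^{-\sigma_1} \sum_{k=0}^{\infty} \frac{1}{k!} (\log n)^k (\sigma_1 - \sigma)^k$$
$$= \sum_{n=1}^{\infty} a_n n^{-\sigma_1} \exp\left(\log n\, (\sigma_1 - \sigma)\right)$$
$$= \sum_{n=1}^{\infty} a_n n^{-\sigma_1} n^{\sigma_1 - \sigma}$$
$$= \sum_{n=1}^{\infty} a_n n^{-\sigma},$$

where the rearrangement of the order of summation is justified by the fact that the terms are positive. This shows that the series converges at $\sigma$.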


4.18. Lemma. Let $G$ be the set of characters modulo $m$ and $p$ a prime number which does not divide $m$. We denote by $f_p$ the order of $p \bmod m$ and $g_p := \phi(m)/f_p$. Then we have the identity:

$$\prod_{\chi \in G} (1 - \chi(p)T) = \left(1 - T^{f_p}\right)^{g_p}.$$


Proof. This follows from Lemma 4-4.3 on the values of characters at a point.
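The identity of Lemma 4.18 can be tested on a small case. The sketch below assumes a prime modulus $m$ with a known primitive root $g$ (here $m = 7$, $g = 3$, chosen for the example), so that the characters are $\chi_k(g^j) = \exp(2\pi i jk/(m-1))$, and evaluates both sides at a sample complex value of $T$; the function names are illustrative.

```python
import cmath


def characters_mod_prime(m, g):
    """All characters of (Z/mZ)^*, assuming m is prime and g is a primitive root mod m."""
    index = {}
    value = 1
    for j in range(m - 1):
        index[value] = j  # discrete logarithm: g^j = value mod m
        value = value * g % m

    def make_character(k):
        return lambda n: cmath.exp(2j * cmath.pi * k * index[n % m] / (m - 1))

    return [make_character(k) for k in range(m - 1)]


def multiplicative_order(p, m):
    """Order of p in (Z/mZ)^*, assuming gcd(p, m) = 1."""
    order, power = 1, p % m
    while power != 1:
        power = power * p % m
        order += 1
    return order


def both_sides(m, g, p, t):
    """Return (prod over chi of (1 - chi(p) t), (1 - t^f_p)^g_p)."""
    left = 1
    for chi in characters_mod_prime(m, g):
        left *= 1 - chi(p) * t
    f_p = multiplicative_order(p, m)
    g_p = (m - 1) // f_p
    return left, (1 - t ** f_p) ** g_p


if __name__ == "__main__":
    left, right = both_sides(7, 3, 2, 0.5 + 0.3j)
    print(abs(left - right) < 1e-9)
```

Here $p = 2$ has order $f_p = 3$ modulo 7 and $g_p = 2$, so the product equals $(1 - T^3)^2$.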


4.19. Corollary. The function $F(s) := \prod_{\chi \in G} L(\chi, s)$ is a Dirichlet series with positive coefficients in the half-plane $\mathrm{Re}(s) > 1$ and has a simple pole at $s = 1$.


Proof. To prove the first statement, we compute

$$F(s) = \prod_{p \nmid m} \prod_{\chi} \left(1 - \frac{\chi(p)}{p^s}\right)^{-1} = \prod_{p \nmid m} \left(1 - \frac{1}{p^{f_p s}}\right)^{-g_p} = \prod_{p \nmid m} \left(\sum_{r=0}^{\infty} p^{-r f_p s}\right)^{g_p},$$

which is clearly a Dirichlet series with positive coefficients. By furthermore noticing that $g_p \ge 1$ and $f_p \le \phi(m)$, we have, for $\sigma \in \mathbb{R}$,

$$F(\sigma) = \prod_{p \nmid m} \left(\sum_{r=0}^{\infty} p^{-r f_p \sigma}\right)^{g_p} \ge \prod_{p \nmid m} \left(1 + p^{-\phi(m)\sigma}\right).$$

Thus the series and the product diverge for $\sigma = 1/\phi(m)$.

Furthermore, the function $L(\chi_0, s)$, like $\zeta(s)$, is meromorphic on $\mathrm{Re}(s) > 0$, with a unique simple pole at $s = 1$. The other $L(\chi, s)$ are holomorphic on $\mathrm{Re}(s) > 0$, hence the product of these functions is meromorphic on $\mathrm{Re}(s) > 0$, with a simple pole at $s = 1$ if $\prod_{\chi \ne \chi_0} L(\chi, 1) \ne 0$ and no poles if one of the $L(\chi, 1)$ is zero. We will now show that the latter case cannot happen. To see why this is true, if the product function were holomorphic on $\mathrm{Re}(s) > 0$, the abscissa of convergence would therefore be $\le 0$ in light of the lemma about Dirichlet series with positive coefficients (Lemma 4-4.17), which contradicts the divergence at $\sigma = 1/\phi(m) > 0$.

We can deduce from the previous argument that $L(\chi, 1)$ is non-zero for every $\chi$ different from the unitary character, and therefore we have indeed finished the proof of Dirichlet's theorem on arithmetic progressions.


4.20. Remark. We can show (up to factors corresponding to prime numbers $p$ which divide $m$) that the product $\prod_{\chi} L(\chi, s)$ is equal to the Dedekind zeta function of the field $\mathbb{Q}(\exp(2\pi i/m))$, which explains in a more conceptual manner why the coefficients are positive.


5. The Prime Number Theorem 


We will prove the following form of the prime number theorem.

5.1. Theorem. The integral $\int_1^{\infty} (\theta(t) - t)\, t^{-2}\,dt$ is convergent.


We will show that the convergence of this integral implies $\theta(x) \sim x$, and hence $\pi(x) \sim x/\log x$. To see this, suppose that $\limsup \theta(x)x^{-1} > 1$; then there exist $\epsilon > 0$ and $x_n$ tending to infinity such that $\theta(x_n)x_n^{-1} \ge 1 + \epsilon$. For $t \in [x_n, (1 + \epsilon/2)x_n]$, we therefore have

$$\frac{\theta(t) - t}{t^2} \ge \frac{\theta(x_n) - (1 + \epsilon/2)x_n}{t^2} \ge \frac{\epsilon x_n/2}{t^2} \ge \frac{\epsilon x_n}{2(1 + \epsilon/2)^2 x_n^2},$$

and consequently

$$\int_{x_n}^{(1+\epsilon/2)x_n} \frac{\theta(t) - t}{t^2}\,dt \ge \frac{\epsilon^2}{4(1 + \epsilon/2)^2},$$

which contradicts the convergence of the integral. We can therefore conclude that $\limsup \theta(x)x^{-1} \le 1$. A symmetric argument shows that $\liminf \theta(x)x^{-1} \ge 1$. It follows that $\lim \theta(x)x^{-1} = 1$.
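The limit $\theta(x)/x \to 1$ can be watched numerically. The following is a minimal sketch which computes the Tchebychev function $\theta(x) = \sum_{p \le x} \log p$ with a sieve; the sample values of $x$ and the function names are choices made for the example.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            is_prime[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def chebyshev_theta(x):
    """Tchebychev function: sum of log p over primes p <= x."""
    return sum(math.log(p) for p in primes_up_to(x))


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        print(x, chebyshev_theta(x) / x)
```

The printed ratios approach 1 slowly, which is consistent with the theorem but of course proves nothing about the limit.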


To prove the theorem, we will use the following result from complex analysis 
(due to Newman, see [55, 81]) concerning the Laplace transform. 


5.2. Theorem. ("The analytic theorem") Let $h(t)$ be a bounded, piecewise continuous function. Then the integral

$$F(s) = \int_0^{+\infty} h(u) e^{-su}\,du$$

is convergent and defines a holomorphic function on the half-plane $\mathrm{Re}(s) > 0$. Suppose that this function can be analytically continued to a holomorphic function on the closed half-plane $\mathrm{Re}(s) \ge 0$. Then the integral converges for $s = 0$ and

$$F(0) = \int_0^{+\infty} h(u)\,du.$$

Let us provisionally admit that this result is true and see how to apply it to the function

$$F(s) := \int_0^{+\infty} \frac{\theta(e^u) - e^u}{e^u}\, e^{-su}\,du = \int_0^{+\infty} \left[\theta(e^u)e^{-u} - 1\right] e^{-su}\,du.$$

The function $h(u) := \theta(e^u)e^{-u} - 1$ is indeed bounded and piecewise continuous. If the analytic continuation hypothesis is satisfied, then we know that $F(0) = \int_0^{+\infty} (\theta(e^u)e^{-u} - 1)\,du = \int_1^{+\infty} \frac{\theta(t) - t}{t^2}\,dt$ is indeed convergent.

We could transform the integral which defines $F(s)$ (for $\mathrm{Re}(s) > 0$) as follows:

$$F(s) = \int_1^{+\infty} \frac{\theta(t) - t}{t^{s+2}}\,dt$$
$$= \sum_{n=1}^{\infty} \int_n^{n+1} \theta(n)\, t^{-s-2}\,dt - \int_1^{+\infty} t^{-s-1}\,dt$$
$$= \sum_{n=1}^{\infty} \theta(n)\, \frac{n^{-s-1} - (n+1)^{-s-1}}{s+1} - \frac{1}{s}$$
$$= \frac{1}{s+1} \sum_{n=1}^{\infty} n^{-s-1} \left(\theta(n) - \theta(n-1)\right) - \frac{1}{s}.$$

We have also seen that

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{p,\, m \ge 1} \log(p)\, p^{-ms} = \sum_{p} \log(p)\, p^{-s} + \sum_{p,\, m \ge 2} \log(p)\, p^{-ms}.$$

The second term in the last expression is a convergent series and hence holomorphic for $\mathrm{Re}(s) > 1/2$. From this, we can deduce that $\sum_p \log(p)\, p^{-s} = -\frac{\zeta'(s)}{\zeta(s)}$ + a holomorphic function on $\mathrm{Re}(s) > 1/2$, and finally that

$$F(s) = -\frac{\zeta'(s+1)}{(s+1)\zeta(s+1)} - \frac{1}{s} + \text{a holomorphic function on } \mathrm{Re}(s) > -1/2.$$


The key point in the proof is therefore the following result. 


5.3. Theorem. (Hadamard, de la Vallée-Poussin) The function $\zeta(s)$ does not have any zeros on the line $\mathrm{Re}(s) = 1$.
Proof. We start with the formula

$$4\cos(x) + \cos(2x) + 3 = 2(1 + \cos(x))^2 \ge 0.$$

Recall that

$$\log|\zeta(\sigma + it)| = \sum_{p,\, m} \frac{p^{-m\sigma}}{m} \cos(mt \log p).$$

This implies that

$$\log\left(|\zeta(\sigma + it)|^4\, |\zeta(\sigma + 2it)|\, \zeta(\sigma)^3\right) = \sum_{p,\, m \ge 1} \frac{p^{-m\sigma}}{m} \left(4\cos(mt \log p) + \cos(2mt \log p) + 3\right) \ge 0.$$

We can conclude from this, assuming $\sigma > 1$, that

$$|\zeta(\sigma + it)|^4\, |\zeta(\sigma + 2it)|\, \zeta(\sigma)^3 \ge 1. \qquad (4.22)$$

Now, if $\zeta(s)$ had a zero of order $k$ at $1 + it$ and of order $\ell$ at $1 + 2it$, then $|\zeta(\sigma + it)| \sim a(\sigma - 1)^k$, $|\zeta(\sigma + 2it)| \sim b(\sigma - 1)^{\ell}$ and $\zeta(\sigma) \sim (\sigma - 1)^{-1}$ (where $\sigma$ tends to 1 from above). The left-hand side of inequality (4.22) is therefore (asymptotically) equivalent to $c(\sigma - 1)^{4k + \ell - 3}$, which implies that $4k + \ell - 3 \le 0$, and hence $k = 0$.


5.4. Corollary. The function defined on $\mathrm{Re}(s) > 1$ by

$$G(s) := -\frac{\zeta'(s)}{s\,\zeta(s)} - \frac{1}{s - 1}$$

extends to a holomorphic function on $\mathrm{Re}(s) \ge 1$.


Proof. The previous theorem shows that the function $\zeta'(s)/\zeta(s)$ is holomorphic on the line $\mathrm{Re}(s) = 1$, except for $s = 1$. Consequently, the function $G(s)$ also is. To study $G(s)$ in a neighborhood of $s = 1$, we use the fact that $\zeta(s)$ has a simple pole at $s = 1$, and consequently $\zeta'(s)/\zeta(s) = -1/(s - 1) + g(s)$, where $g(s)$ is holomorphic in a neighborhood of 1. Thus $G(s)$ is indeed holomorphic in a neighborhood of 1 and hence on the line $\mathrm{Re}(s) = 1$.


Appendix. Proof of the “analytic theorem” 


Recall the statement of the analytic result used in the proof of the prime 
number theorem. 
5.5. Theorem. If $h(t)$ is a bounded piecewise continuous function, then the integral (the Laplace transform of $h$)

$$F(s) = \int_0^{+\infty} h(u) e^{-su}\,du$$

is convergent and defines a function which is holomorphic on the half-plane $\mathrm{Re}(s) > 0$. Suppose that this function can be analytically continued to a holomorphic function on the closed half-plane $\mathrm{Re}(s) \ge 0$. Then the integral for $s = 0$ converges and

$$F(0) = \int_0^{+\infty} h(u)\,du.$$

Proof. The first part is analogous to the theorem of convergence for Dirichlet series (see Exercise 4-6.2). We will therefore prove the second statement. For a (large) real number $T$, let $F_T(s) := \int_0^T h(t) e^{-st}\,dt$; these are functions which are holomorphic for all $s \in \mathbb{C}$. We now need to show that $\lim_{T \to \infty} F_T(0)$ exists and equals $F(0)$. To do this, we consider for some large $R$ the contour $\gamma = \gamma(R, \delta)$ which bounds the region $S := \{s \in \mathbb{C} \mid \mathrm{Re}(s) \ge -\delta \text{ and } |s| \le R\}$. Once we have fixed $R$, we can choose $\delta > 0$ sufficiently small so that $F(s)$ is analytic on this region.

[Figure: the contour $\gamma(R, \delta)$, contained in the open subset where $F(s)$ is analytic.]


The trick lies in introducing the function

$$G_T(s) := (F(s) - F_T(s))\, e^{sT} \left(1 + \frac{s^2}{R^2}\right),$$

so that $G_T(0) = F(0) - F_T(0)$. Therefore, everything comes back to proving that $\lim_{T \to \infty} G_T(0) = 0$. To do this, we will use the residue theorem a first time, noticing that

$$G_T(0) = F(0) - F_T(0) = \frac{1}{2\pi i} \int_{\gamma} (F(s) - F_T(s))\, e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{ds}{s}.$$


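To find an upper bound on this integral, we cut the contour into two pieces: $\gamma_1$, which is the piece of $\gamma$ which lies in the half-plane $\mathrm{Re}(s) > 0$, and $\gamma_2$, which lies in the half-plane $\mathrm{Re}(s) < 0$. We then carry out the following computation.

Let $s$ be a number such that $|s| = R$, or $s = Re^{i\theta}$. Then we have

$$\left|e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{1}{s}\right| = e^{\mathrm{Re}(s)T}\, \frac{2|\mathrm{Re}(s)|}{R^2}.$$

We also have the upper bound, for $\mathrm{Re}(s) > 0$ and with $M$ denoting an upper bound for $|h|$,

$$|F(s) - F_T(s)| = \left|\int_T^{\infty} h(t) e^{-st}\,dt\right| \le M \int_T^{\infty} |e^{-st}|\,dt = \frac{M e^{-\mathrm{Re}(s)T}}{\mathrm{Re}(s)}.$$

This gives us

$$\left|\frac{1}{2\pi i} \int_{\gamma_1} (F(s) - F_T(s))\, e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{ds}{s}\right| \le \frac{M}{R}.$$

Thus assuming that $R$ is very large, this part of the integral will be arbitrarily small. Now, cut the integral over $\gamma_2$ into two pieces, $I_1$ and $I_2$, where

$$I_1 := \frac{1}{2\pi i} \int_{\gamma_2} F(s)\, e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{ds}{s}, \qquad I_2 := \frac{1}{2\pi i} \int_{\gamma_2} F_T(s)\, e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{ds}{s}.$$

To find an upper bound on $I_2$, observe first that $F_T(s)$ is entire. The residue theorem (or actually the Cauchy formula in this case) allows us to then replace the contour $\gamma_2$ by the arc of a circle of radius $R$ which lies in the half-plane $\mathrm{Re}(s) < 0$ and, by using the same upper bounds, to conclude that $|I_2| \le M/R$. To find an upper bound on $I_1$, simply notice that the function $F(s)\, e^{sT} \left(1 + \frac{s^2}{R^2}\right) \frac{1}{s}$ converges to 0 when $T$ tends to $+\infty$ and converges uniformly on every compact set contained in $\mathrm{Re}(s) < 0$. Consequently,

$$\lim_{T \to +\infty} I_1 = 0.$$

By putting the three upper bounds together, we see that

$$|F_T(0) - F(0)| \le \frac{2M}{R} + \epsilon(T),$$

where $\epsilon(T)$ tends to zero (in a way dependent on $R$). We needed to show that $\lim F_T(0) = F(0)$, which is now accomplished.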


Supplement. Analytic continuation and the functional equation 


We will now outline the main steps of the proof of the following theorem 
due to Riemann. 


5.6. Theorem. (The functional equation of the Riemann zeta function) The function $\zeta(s) - 1/(s - 1)$ can be analytically continued to the whole complex plane. Furthermore, the function $\zeta(s)$ satisfies the functional equation given by

$$\xi(s) = \xi(1 - s), \qquad (4.23)$$

where $\xi(s) := \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s)$.


As a preliminary, we will recall the construction of the function $\Gamma(s)$ and the Poisson formula which gives the functional equation for the theta series.


5.7. Lemma. The integral $\Gamma(s) := \int_0^{\infty} e^{-t} t^{s-1}\,dt$ defines a holomorphic function for $\mathrm{Re}(s) > 0$, which satisfies the functional equation $\Gamma(s + 1) = s\Gamma(s)$. It can be continued to all of $\mathbb{C}$ as a meromorphic function with simple poles at $0, -1, -2, -3, \dots$.


Proof. Showing that the integral is convergent does not pose any problems. The functional equation can be obtained by integrating by parts. The functional equation also allows us to analytically continue by induction from $\mathrm{Re}(s) > -n$ to $\mathrm{Re}(s) > -n - 1$ by using the fact that $\Gamma(s) = s^{-1}\Gamma(s + 1)$. Finally, the expression

$$\Gamma(s) = \frac{1}{s(s+1)\cdots(s+n)}\, \Gamma(s + n + 1)$$

makes it clear where the poles are.


We can also prove that for all $s$, $\Gamma(s) \ne 0$ (see Exercise 4-6.19).


5.8. Lemma. (Poisson formula) Let $f(x)$ be an integrable function over $\mathbb{R}$ (i.e., in $L^1(\mathbb{R})$). We define its Fourier transform by

$$\hat{f}(y) = \int_{-\infty}^{+\infty} f(x) \exp(2\pi i x y)\,dx$$

and assume that the function $\sum_{n \in \mathbb{Z}} f(x + n)$ is of bounded variation on $[0, 1]$ and continuous. Then the following formula holds:

$$\sum_{n \in \mathbb{Z}} f(n) = \sum_{m \in \mathbb{Z}} \hat{f}(m). \qquad (4.24)$$


Proof. We introduce the function $G(x) := \sum_{n \in \mathbb{Z}} f(x + n)$ (the hypotheses guarantee the existence and continuity of such a function), which is clearly a periodic function. Dirichlet's theorem on Fourier series allows us to write its Fourier expansion as

$$G(x) = \sum_{m \in \mathbb{Z}} \hat{G}(m) \exp(2\pi i m x),$$

where the Fourier coefficients can be calculated as follows:

$$\hat{G}(m) = \int_0^1 G(t) \exp(-2\pi i m t)\,dt = \sum_{n \in \mathbb{Z}} \int_0^1 f(t + n) \exp(-2\pi i m t)\,dt = \int_{-\infty}^{+\infty} f(x) \exp(-2\pi i x m)\,dx = \hat{f}(-m).$$

This gives

$$\sum_{n \in \mathbb{Z}} f(x + n) = \sum_{m \in \mathbb{Z}} \hat{f}(m) \exp(-2\pi i m x).$$

The Poisson formula follows from that by taking $x = 0$.


This formula is most often applied to a function $f$ which is continuously differentiable and fast decreasing (i.e., $f(x) = O(|x|^{-M})$ for all $M$), and therefore the function $G$ is itself continuously differentiable. This is the case when applying the formula to the following "theta" function.


5.9. Corollary. The function² $\theta(u) := \sum_{n \in \mathbb{Z}} \exp(-\pi u n^2)$ satisfies the functional equation for all $u \in \mathbb{R}_+^*$ given by:

$$\theta(1/u) = \sqrt{u}\, \theta(u). \qquad (4.25)$$

Proof. It suffices to apply the Poisson formula to the function $f(x) = \exp(-\pi u x^2)$ and to verify that $\hat{f}(y) = \exp(-\pi y^2/u)/\sqrt{u}$.
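The functional equation (4.25) is easy to test numerically because the series converges extremely fast. The following sketch truncates the sum at $|n| \le 50$, a cutoff chosen for the example; the function name is illustrative.

```python
import math


def theta_series(u, cutoff=50):
    """Truncated theta series: sum over |n| <= cutoff of exp(-pi * u * n^2)."""
    return sum(math.exp(-math.pi * u * n * n) for n in range(-cutoff, cutoff + 1))


if __name__ == "__main__":
    for u in (0.3, 1.0, 2.5):
        left = theta_series(1.0 / u)
        right = math.sqrt(u) * theta_series(u)
        print(u, left, right)
```

For small $u$ the direct series needs many terms while the series in $1/u$ needs very few, which is one practical use of the functional equation.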


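Proof. (of Theorem 4-5.6) We start with the following computation (where we introduce $t = \pi n^2 u$), which is valid for $\mathrm{Re}(s) > 1$:

$$\xi(s) = \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s) = \sum_{n \ge 1} \int_0^{\infty} e^{-t}\, t^{s/2}\, \pi^{-s/2}\, n^{-s}\, \frac{dt}{t}$$
$$= \int_0^{\infty} \left(\sum_{n \ge 1} \exp(-\pi u n^2)\right) u^{s/2}\, \frac{du}{u}$$
$$= \int_0^{\infty} \tilde{\theta}(u)\, u^{s/2}\, \frac{du}{u},$$

where

$$\tilde{\theta}(u) := \sum_{n \ge 1} \exp(-\pi u n^2) = \frac{\theta(u) - 1}{2}.$$

Let us point out that $\tilde{\theta}(u) = O(\exp(-\pi u))$ when $u$ tends to infinity and that the functional equation of the function $\theta$ can be translated into

$$\tilde{\theta}\left(\frac{1}{u}\right) = \sqrt{u}\, \tilde{\theta}(u) + \frac{1}{2}\left(\sqrt{u} - 1\right). \qquad (4.26)$$

By using the simple computation $\int_1^{\infty} t^{-s}\,dt = 1/(s - 1)$ and the functional equation of the theta function (4.25), we obtain

$$\xi(s) = \int_0^1 \tilde{\theta}(u)\, u^{s/2}\, \frac{du}{u} + \int_1^{\infty} \tilde{\theta}(u)\, u^{s/2}\, \frac{du}{u}$$
$$= \int_1^{\infty} \tilde{\theta}\left(\frac{1}{u}\right) u^{-s/2}\, \frac{du}{u} + \int_1^{\infty} \tilde{\theta}(u)\, u^{s/2}\, \frac{du}{u}$$
$$= \int_1^{\infty} \tilde{\theta}(u) \left(u^{s/2} + u^{(1-s)/2}\right) \frac{du}{u} + \frac{1}{s - 1} - \frac{1}{s}.$$

We have a priori obtained the desired expression only for $\mathrm{Re}(s) > 1$, but we can easily see that the integral defines an entire function since $\tilde{\theta}(u) = O(\exp(-\pi u))$, and that the whole expression is symmetric under the transformation $s \leftrightarrow 1 - s$.


² We hope that the context will allow the reader to distinguish this function from the Tchebychev function $\theta(x) = \sum_{p \le x} \log p$.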


Supplement without proofs 


1) To establish the prime number theorem, we could, in place of the "analytic theorem", use Ikehara's theorem [40] (sometimes called the Ikehara-Wiener theorem), which is more powerful but also trickier to prove. We will settle for stating the theorem. Its extension to the case of a multiple pole was proven by Delange [25].
5.10. Theorem. (Ikehara) Let $A(t)$ be an increasing function such that the integral $F(s) = \int_1^{\infty} A(t)\, t^{-s-1}\,dt$ is convergent whenever $\mathrm{Re}(s) > 1$ and can be analytically continued to the line $\mathrm{Re}(s) = 1$ except for a simple pole at $s = 1$ with residue $\lambda$ (in other words, the function $F(s) - \lambda/(s - 1)$ can be analytically continued to $\mathrm{Re}(s) \ge 1$). Then,

$$A(x) = \lambda x + o(x).$$

If $\lambda = 0$ (in other words, if there is not a pole at $s = 1$), then $A(x) = o(x)$; if not, $A(x) \sim \lambda x$. More generally (Delange), if $F(s)$ can be analytically continued to the line $\mathrm{Re}(s) = 1$ with a pole of order $t$ at $s = 1$ and principal term equal to $\lambda/(s - 1)^t$, then $A(x) \sim \frac{\lambda}{(t-1)!}\, x (\log x)^{t-1}$.


The prime number theorem follows from this theorem by using the fact that the hypotheses are satisfied when $A(x) = \psi(x)$, since

$$-\frac{\zeta'(s)}{\zeta(s)} = s \int_1^{+\infty} \psi(t)\, t^{-s-1}\,dt.$$


There are other paths or variations to arrive at the prime number theorem: see, for example, the proofs found in [18], [41] and [72]. Besides these, we would like to bring to your attention the proofs found in [4], [23], [51] and [53], which rely on "elementary" methods (not using a complex variable). The first elementary proof (1949) is due to Erdős and Selberg.


2) The result on the non-vanishing of the $\zeta$ function on the line $\mathrm{Re}(s) = 1$ could be considerably stronger, at least conjecturally. First of all, knowing that $\zeta(s)$ does not vanish in $\mathrm{Re}(s) \ge 1$, we can deduce from the functional equation that, in the closed half-plane $\mathrm{Re}(s) \le 0$, the function $\zeta(s)$ vanishes only at the points $s = -2, -4, -6, \dots$, with order equal to one. To see why this is true, the function $\xi(s) = \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s)$ does not vanish for $\mathrm{Re}(s) \ge 1$ (and has a simple pole at $s = 1$), hence, by the functional equation, it does not vanish for $\mathrm{Re}(s) \le 0$ (and has a simple pole at $s = 0$). Furthermore, the function $\Gamma(s/2)$ never vanishes (see Exercise 4-6.19) and has a simple pole at $s = -2n$, for $n \in \mathbb{N}$ (see Lemma 4-5.7), hence $\zeta(s)$ must vanish at $s = -2n$, for $n \ge 1$. The question of describing the zeros in the critical strip, in other words in the strip $0 < \mathrm{Re}(s) < 1$, is much more delicate. The functional equation implies that the zeros are situated symmetrically with respect to the line $\mathrm{Re}(s) = \frac{1}{2}$. Riemann, in his extraordinary essay [60], suggested that the zeros are all situated on the line of symmetry.


"Man findet nun in der That etwa so viel reelle Wurzeln innerhalb dieser Grenzen, und es ist sehr wahrscheinlich, dass alle Wurzeln reell sind. Hiervon wäre allerdings ein strenger Beweis zu wünschen; ich habe indess die Aufsuchung desselben nach einigen flüchtigen vergeblichen Versuchen vorläufig bei Seite gelassen, da er für den nächsten Zweck meiner Untersuchung entbehrlich schien."³


Keeping the previous notation, this can be formulated as follows. 


5.11. Conjecture. ("Riemann hypothesis"⁴) Let $s \in \mathbb{C}$ where $\mathrm{Re}(s) > 1/2$; then $\zeta(s) \ne 0$. In other words, if $0 < \mathrm{Re}(s) < 1$ and $\zeta(s) = 0$, then

$$\mathrm{Re}(s) = \frac{1}{2}.$$

If this result proves to be true, it would follow that for every $\alpha > 1/2$, $\psi(x) = x + O(x^{\alpha})$ and $\theta(x) = x + O(x^{\alpha})$. By using the formula obtained above via Abel's formula, we can also deduce a much more precise equivalence for $\pi(x)$:

$$\pi(x) = \frac{\theta(x)}{\log x} + \int_2^x \frac{\theta(t)}{t (\log t)^2}\,dt.$$

This formula can be transformed into

$$\pi(x) = \int_2^x \frac{dt}{\log t} + \frac{2}{\log 2} + \frac{\theta(x) - x}{\log x} + \int_2^x \frac{\theta(t) - t}{t (\log t)^2}\,dt.$$

By introducing the "logarithmic integral function":

$$\mathrm{Li}(x) := \int_2^x \frac{dt}{\log t},$$

we can see that the Riemann hypothesis implies that $\pi(x) = \mathrm{Li}(x) + O(x^{\alpha})$ for every $\alpha > 1/2$. By observing that

$$\mathrm{Li}(x) = \frac{x}{\log x} + \frac{x}{(\log x)^2} + O\left(\frac{x}{(\log x)^3}\right),$$

we can see that $\mathrm{Li}(x)$ constitutes an estimate which is much more precise than $x/\log(x)$. Alas, the best proven result is far from our hopes, but we nonetheless know how to prove statements such as

$$\pi(x) = \mathrm{Li}(x) + O\left(x \exp\left(-c\sqrt{\log x}\right)\right).$$
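The claim that $\mathrm{Li}(x)$ is a much better estimate of $\pi(x)$ than $x/\log x$ can be checked for a moderate value of $x$. The sketch below computes $\pi(x)$ with a sieve and approximates $\mathrm{Li}(x)$ by Simpson's rule after the substitution $t = e^u$; the quadrature method, the step count and the function names are choices made for the example.

```python
import math


def prime_count(x):
    """pi(x): the number of primes p <= x, via the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0:2] = b"\x00\x00"
    for n in range(2, int(x ** 0.5) + 1):
        if is_prime[n]:
            is_prime[n * n :: n] = bytearray(len(range(n * n, x + 1, n)))
    return sum(is_prime)


def logarithmic_integral(x, steps=100_000):
    """Li(x) = integral from 2 to x of dt / log t, by Simpson's rule in u = log t."""
    lower, upper = math.log(2), math.log(x)
    width = (upper - lower) / steps  # steps must be even for Simpson's rule

    def integrand(u):
        return math.exp(u) / u

    total = integrand(lower) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * width)
    return total * width / 3


if __name__ == "__main__":
    x = 10 ** 6
    print(prime_count(x), logarithmic_integral(x), x / math.log(x))
```

At $x = 10^6$ the error of $\mathrm{Li}(x)$ is of the order of a hundred, while the error of $x/\log x$ is several thousand.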


Because the zeta function is intimately linked to prime numbers, we can reinterpret the functional equation and the Riemann hypothesis to be the expression of a higher order symmetry which brings a mysterious balance to the apparent chaos of the distribution of prime numbers. It provides in some sense an answer to Euler's thought with which we will close this chapter.⁵


³ "One finds, indeed, approximately so many real roots between these limits, and it is very likely that all of the roots are real. Certainly, a strict proof thereof needs to be done; I have however left aside the exploration of this question after some fleeting attempts in vain, since it seemed to be not essential for my current research objectives."

⁴ The Riemann hypothesis is one of the major open problems in mathematics; the Clay Mathematics Institute also offers a million dollars for its solution.


"Les mathématiciens ont tâché jusqu'ici en vain à découvrir un ordre quelconque dans la progression des nombres premiers, et on a lieu de croire que c'est un mystère auquel l'esprit humain ne saurait jamais pénétrer. Pour s'en convaincre, on n'a qu'à jeter les yeux sur les tables des nombres premiers, que quelques personnes se sont donné la peine de continuer au-delà de cent-mille: et on s'apercevra d'abord qu'il n'y règne aucun ordre ni règle."⁶


⁵ Leonhard Euler: citation taken from his article Découverte d'une loi tout extraordinaire des nombres par rapport à la somme de leurs diviseurs, Bibliothèque impartiale 3, 1751, 10-31. The citation reproduced here can also be found in the reissue of the article from Opera Posthuma 1, 1862, 76-84 and is available on the website: http://math.dartmouth.edu/~euler.

⁶ "Mathematicians have tried, so far in vain, to discover some order in the progression of prime numbers, and we are led to believe that it is a mystery which the human mind will never know how to penetrate. To be convinced of this, we only have to have a look at the tables of prime numbers, which some people have taken the pain to continue to more than one-hundred thousand: and we realize right away that neither order nor rule prevails there."


6. Exercises


6.1. Exercise. Prove that Euclid's argument showing that the set of prime numbers is not finite implies that $p_n < 2^{2^n}$. Deduce from this the lower bound $\pi(x) \ge \log\log x$ for $x \ge 2$.


6.2. Exercise. Let $h : [0, +\infty) \to \mathbb{C}$ be a piecewise continuous function (or locally integrable). Prove that the Laplace transform $F(s) := \int_0^{\infty} h(u) e^{-us}\,du$ is convergent in the half-plane $\mathrm{Re}(s) > \sigma_0$ and defines a holomorphic function there.


Hint. You could use a procedure analogous to Proposition 4-3.1; more generally, you could extend the statement to functions of the type $F(s) = \int_{\mathbb{R}_+} e^{-us}\,d\mu(u)$.


6.3. Exercise. Recall that $(f * g)(n) = \sum_{d \mid n} f(d) g(n/d)$. Prove that if the two Dirichlet series $F(s) = \sum_{n=1}^{\infty} f(n) n^{-s}$ and $G(s) = \sum_{n=1}^{\infty} g(n) n^{-s}$ converge absolutely for $\mathrm{Re}(s) > \sigma_0$, then in the same half-plane, the following equations hold:

$$F(s)G(s) = \left(\sum_{n=1}^{\infty} f(n) n^{-s}\right) \left(\sum_{n=1}^{\infty} g(n) n^{-s}\right) = \sum_{n=1}^{\infty} (f * g)(n)\, n^{-s}.$$

In particular, prove that for $\mathrm{Re}(s) > 1$, we have

$$\zeta(s)^{-1} = \sum_{n=1}^{\infty} \mu(n) n^{-s},$$

where $\mu$ is the arithmetic Möbius function (i.e., $\mu(1) = 1$, $\mu(p_1 \cdots p_k) = (-1)^k$ and $\mu(n) = 0$ if $n$ has a square factor).


6.4. Exercise. (Möbius inversion formula) From among the arithmetic functions from $\mathbb{N} \setminus \{0\}$ to $\mathbb{C}$, we define the function $\delta$ by $\delta(n) = 0$, except $\delta(1) = 1$, and the function $\mathbf{1}$ by $\mathbf{1}(n) = 1$, for all $n$. Prove that $\delta$ is the identity element for the product $*$ and that $\mu * \mathbf{1} = \delta$. Deduce from this the Möbius inversion formula:

$$g(n) = \sum_{d \mid n} f(d) \iff f(n) = \sum_{d \mid n} \mu(d)\, g(n/d).$$


6.5. Exercise. (Second Möbius inversion formula) Let $F$ and $G$ be two functions of a positive real variable. Prove that if $G(x) = \sum_{n \le x} F\left(\frac{x}{n}\right)$, then $F(x) = \sum_{n \le x} \mu(n)\, G\left(\frac{x}{n}\right)$.


6.6. Exercise. Let $\chi$ be a nontrivial Dirichlet character modulo $N$. In this exercise, you are asked to give a finite explicit formula for $L(\chi, 1)$.


a) Let $L(\theta) := \sum_{n \ge 1} \exp(in\theta)\, n^{-1}$. By using the complex logarithm, prove that if $\theta \in\, ]0, 2\pi[$, then

$$L(\theta) = -\log(2\sin(\theta/2)) + i\left(\frac{\pi}{2} - \frac{\theta}{2}\right).$$


b) Prove that 


N-1 
L(x,1) = —G(x)~? > X(w) log sin () 


and if x is odd (i.e., x(—1) = —1), then 
N-1 
L(x,1) = Ar X(u)u. 
NG(x) X, 
d) Let x be a character modulo 4 such that x(—1) = -1. Verify that 


L(x, 1) = 1/2V2. Let x’ be a character modulo 5 such that y(—1) = 1, 
x(2) = x(3) =—1. Verify that L(y’,1) =logn/V5 where 


_ sin(2m/5)sin(37/5) | 14 V5_ 
sin(7/5) sin(47/5) 2 


We point out that $\eta$ is the fundamental unit of the field $\mathbf{Q}(\sqrt{5})$ and that this formula is a particular case of the class number formula.


6.7. Exercise. Let $\chi$ be a nontrivial Dirichlet character modulo $N$.

a) Prove that $\sum_{m \le y} \chi(m)m^{-1} = L(\chi,1) + O(y^{-1})$.


b) By using the formula $\log n = \sum_{m \mid n} \Lambda(m)$ (which you should first verify), prove that

$$\sum_{n \le x} \frac{\chi(n)\log n}{n} = L(\chi,1) \sum_{m \le x} \frac{\chi(m)\Lambda(m)}{m} + O(1).$$


c) By using the fact that $L(\chi,1) \neq 0$, prove that $\sum_{m \le x} \frac{\chi(m)\Lambda(m)}{m}$ is bounded when $x$ tends to infinity, and deduce the convergence of the series $\sum_{p} \frac{\chi(p)}{p}$ from this.


d) Finally, deduce that the Euler product $\prod_{p}\big(1 - \chi(p)p^{-1}\big)^{-1}$ is convergent and equals $L(\chi,1)$.


6.8. Exercise. We denote by $p_1, p_2, p_3, \dots$ the increasing sequence of prime numbers.


a) Prove that 
N 
N2 
n=1 


b) Deduce from this a function equivalent to the sum $\sum_{p \le X} p$, when $X$ tends to infinity.


6.9. Exercise. Let d be an odd integer. We define 
a) Determine the abscissa of convergence of the series, and prove that the 
following equation holds: 


mato) =TH(-(4)-*) 


P 
b) Prove that the function $\zeta(s)L_d(s)$ can be written as a Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ where $a_n \ge 0$.


6.10. Exercise. Prove the Euler-Maclaurin formula, which generalizes 
Abel’s formula (Lemma 4-1.9): 


Tr 7 _4\k+1 
S sin) =f seoae + > EPs (0006) — 70(0) 


a<ndb k=0 ( 


where by = ay and the functions By, are defined ont € [0,1[ by Bo(t) = 
1, By (t) = kBy_1(t) and ii B,,(t)dt = 0, and then extended by periodicity. 
Hint.— The case k = 0 is Abel’s formula, and for k > 0, proceed by integra- 


tion by parts and induction. 


6.11. Exercise. Let $\gamma = \lim_{n \to \infty}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right)$ be Euler's constant. Prove that

$$\lim_{s \to 1}\left\{\zeta(s) - \frac{1}{s-1}\right\} = \gamma.$$


Hint. You could attempt a direct computation or compare the formula from Proposition 4-3.6, which gives a continuation of the $\zeta(s)$ function, to the expression for $\gamma$ from application 1) of Lemma 4-1.9.


6.12. Exercise. Let $f(n) := \operatorname{lcm}(1,2,3,\dots,n)$. Prove that the prime number theorem implies $\log f(n) = n + o(n)$.
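
The statement of Exercise 6.12 can be explored numerically. The sketch below computes $\log \operatorname{lcm}(1,\dots,n)/n$ for a few values of $n$; the function names are illustrative, and the slow drift toward 1 is only suggestive, not a proof.

```python
from math import gcd, log


def lcm_upto(n):
    """Return lcm(1, 2, ..., n) as an exact integer."""
    result = 1
    for k in range(2, n + 1):
        result = result * k // gcd(result, k)
    return result


def log_lcm_ratio(n):
    """Return log(lcm(1..n)) / n, which should approach 1 as n grows."""
    return log(lcm_upto(n)) / n


for n in (10, 100, 1000, 2000):
    print(n, round(log_lcm_ratio(n), 4))
```

Note that $\log \operatorname{lcm}(1,\dots,n)$ is the Chebyshev function $\psi(n)$, so the ratio printed here is $\psi(n)/n$.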


6.13. Exercise. We define the following arithmetic function:

$$g(n) := \max_{m_1 + \cdots + m_r \le n} \operatorname{lcm}(m_1, \dots, m_r),$$

where the $m_i$ are integers $\ge 1$. The integer $g(n)$ represents the maximal order of an element in the permutation group on $n$ letters. This is mainly why we are interested in the function $n \mapsto g(n)$. In this exercise you are asked to prove that

$$\lim_{n \to \infty} \frac{\log g(n)}{\sqrt{n \log n}} = 1.$$

a) Show that you can write

$$g(n) = \max_{p_1^{a_1} + \cdots + p_r^{a_r} \le n} \big(p_1^{a_1} \cdots p_r^{a_r}\big),$$

where the $p_i$ are distinct primes.

b) By using the inequality of arithmetic and geometric means:

$$\sqrt[r]{x_1 \cdots x_r} \le \frac{x_1 + \cdots + x_r}{r},$$

prove that if $p_1^{a_1} + \cdots + p_r^{a_r} \le n$, then $p_1^{a_1} \cdots p_r^{a_r} \le (n/r)^r$.


c) By using the prime number theorem, prove that the sum of the $r$ first prime numbers is asymptotically equivalent to $\frac{r^2}{2}\log r$. Deduce from this, referring back to the previous question, that $n \ge \frac{r^2}{2}\log r\,(1 + o(1))$, and hence $r \le 2\sqrt{n/\log n}\,(1 + o(1))$.


d) By observing that the function $f(x) = (n/x)^x$ is increasing on the interval $[1, n/e]$, conclude that

$$\log g(n) \le \sqrt{n \log n}\,(1 + o(1)).$$


e) For a given (large) $n$, we can choose $r = r(n)$ to be the largest integer such that $p_1 + \cdots + p_r \le n$, where $(p_i)$ denotes the sequence of prime numbers ordered increasingly. Prove that $r$ is asymptotically equivalent to $2\sqrt{n/\log n}$. Prove that $\log g(n) \ge \theta(p_r)$, and finish the exercise by again using the prime number theorem.
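
Part a) of Exercise 6.13 suggests a way to compute the maximal-order function for small $n$: maximize a product of powers of distinct primes whose sum is at most $n$. The following dynamic-programming sketch relies on that reformulation; the function name `max_order` and the table layout are assumptions made for illustration.

```python
def max_order(n):
    """Maximal order of a permutation of n letters, via part a):
    maximize a product of powers of distinct primes with sum <= n."""
    primes = [p for p in range(2, n + 1)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    # best[j] = largest product using the primes processed so far,
    # with the sum of the chosen prime powers at most j
    best = [1] * (n + 1)
    for p in primes:
        new = best[:]
        for j in range(n + 1):
            pk = p
            while pk <= j:
                # use the prime power pk once, on top of a solution
                # that does not involve p at all
                new[j] = max(new[j], best[j - pk] * pk)
                pk *= p
        best = new
    return best[n]


print([max_order(n) for n in range(1, 11)])
```

For $n = 1, \dots, 10$ this returns 1, 2, 3, 4, 6, 6, 12, 15, 20, 30; for instance the value 12 for $n = 7$ comes from a cycle of length 3 together with a cycle of length 4.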


6.14. Exercise. Recall that an arithmetic function is multiplicative (resp. 
completely multiplicative) if f(mn) = f(m)f(n) whenever gcd(m,n) = 1 
(resp. for all m,n). 


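a) If $f$ is a multiplicative arithmetic function, prove that

$$\sum_{n \ge 1} \frac{f(n)}{n^s} = \prod_{p} \sum_{m=0}^{\infty} \frac{f(p^m)}{p^{ms}}.$$


b) If $f$ is a completely multiplicative arithmetic function, prove that

$$\sum_{n \ge 1} \frac{f(n)}{n^s} = \prod_{p} \left(1 - \frac{f(p)}{p^s}\right)^{-1}.$$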
6.15. Exercise. For $n \in \mathbf{N}^*$, we define the arithmetic function "number of integer divisors":

$$\tau(n) := \sum_{d \mid n} 1 = \operatorname{card}\{(d,e) \in \mathbf{N}^2 \mid de = n\},$$

and you are asked in this exercise to study some of its properties.


a) Prove the identity

$$\sum_{n=1}^{\infty} \frac{\tau(n)}{n^s} = \zeta(s)^2.$$

b) Prove that if $n = p_1^{a_1} \cdots p_k^{a_k}$, then $\tau(n) = (a_1 + 1) \cdots (a_k + 1)$, and deduce from this that $\liminf \tau(n) = 2$.

c) Prove that, on average, $\tau(n)$ equals $\log n$ in the following sense:

$$\sum_{n \le X} \tau(n) = \sum_{d \le X} \left\lfloor \frac{X}{d} \right\rfloor \sim X \log X.$$


d) We set $P(x) := \prod_{p \le x} p$. By using the prime number theorem, prove that:

$$\lim_{x \to \infty} \frac{\log \tau(P(x)) \,\log\log P(x)}{\log P(x)} = \log 2.$$

You are now asked to show that

$$\limsup_{n \to \infty} \frac{\log \tau(n)\,\log\log n}{\log n} = \log 2. \qquad (*)$$


To do this, if $n = p_1^{a_1} \cdots p_r^{a_r} \in \mathbf{N}^*$, then we divide $\tau(n) = D_1 D_2$ into two pieces in the following manner. We choose a real number $M \ge 2$ which depends on $n$, and we set $I_1 := \{1 \le i \le r \mid p_i \le M\}$, $I_2 := \{1 \le i \le r \mid p_i > M\}$, $D_1 := \prod_{i \in I_1}(a_i + 1)$ and $D_2 := \prod_{i \in I_2}(a_i + 1)$. (Let us point out that if $I_j = \emptyset$, then $D_j = 1$.)

e) Prove that $D_2 \le 2^{\sum_{i \in I_2} a_i} \le 2^{\log n/\log M}$.

f) Prove that there exists $c > 0$ (independent of $n$ and $M$) such that $D_1 \le \exp(cM \log\log n/\log M)$.

Hint. You could prove and use that $a_i \le \log n/\log 2$ and $\operatorname{card}(I_1) \le \pi(M)$.

g) By choosing $M := \log n/(\log\log n)^2$ in the preceding questions, find an upper bound of $\tau(n)$, and then conclude that equation $(*)$ holds.


6.16. Exercise. Let $k \ge 1$ be an integer. We keep the notation $\tau(n)$ for the function defined in the previous exercise.

a) Prove the identity

$$\sum_{m=0}^{\infty} \tau(p^m)^k T^m = \sum_{m=0}^{\infty} (m+1)^k T^m = \frac{P_k(T)}{(1-T)^{k+1}},$$

where $P_k$ is a polynomial of degree $k-1$ defined by $P_1(T) = 1$ and $P_{k+1}(T) = P_k(T)(1 + kT) + P_k'(T)\,T(1-T)$.


b) Prove that the polynomial $P_k$ can be written as $P_k(T) = 1 + (2^k - k - 1)T + \cdots + T^{k-1}$; deduce from this that the Euler product

$$G_k(s) = \prod_{p} (1 - p^{-s})^{2^k - k - 1} P_k(p^{-s})$$

defines a holomorphic function in the half-plane $\mathrm{Re}(s) > 1/2$, and verify that the following identity is true:

$$\sum_{n=1}^{\infty} \tau(n)^k n^{-s} = \zeta(s)^{2^k} G_k(s).$$


c) By using Ikehara's theorem (Theorem 4-5.10), deduce from the previous question that the following estimate holds:

$$\sum_{n \le x} \tau(n)^k \sim A_k\, x\, (\log x)^{2^k - 1},$$

where $A_k := G_k(1)/(2^k - 1)!$.
6.17. Exercise. We introduce the following arithmetic function:

$$\tau_k(n) := \operatorname{card}\{(n_1, \dots, n_k) \in (\mathbf{N}^*)^k \mid n = n_1 \cdots n_k\}.$$


1) Prove that $\tau_k$ is multiplicative and that

$$\tau_k(p^m) = \frac{(m+k-1)\cdots(m+1)}{(k-1)!} = \binom{m+k-1}{k-1}.$$


2) Prove that $\tau_{k+1}(n) = \sum_{d \mid n} \tau_k(d)$, and deduce from this the equality

$$\sum_{n=1}^{\infty} \frac{\tau_k(n)}{n^s} = \zeta(s)^k.$$


3) By using the generalized (by Delange) theorem of Ikehara, prove that

$$\sum_{n \le x} \tau_k(n) \sim \frac{x\,(\log x)^{k-1}}{(k-1)!}.$$


4) Observe that $\tau_k(p) = k$, and, by imitating the steps in Exercise 4-6.15, prove that

$$\limsup_{n \to \infty} \frac{\log \tau_k(n)\,\log\log n}{\log n} = \log k.$$


Hint. You could proceed in a similar manner and replace the inequality $m + 1 \le 2^m$ by $\frac{(m+k-1)\cdots(m+1)}{(k-1)!} \le k^m$, then find a combinatorial interpretation of the inequality.


6.18. Exercise. Recall that $\phi(n) = \operatorname{card}(\mathbf{Z}/n\mathbf{Z})^*$ denotes the Euler totient and that $\phi(n) = n \prod_{p \mid n}\left(1 - \frac{1}{p}\right)$.

1) For every $n \ge 2$, check that $\phi(n) \le n - 1$.


2) Let $P(x) := \prod_{p \le x}\left(1 - \frac{1}{p}\right)$; by comparing $\log P(x)$ to $\sum_{p \le x} p^{-1}$, prove that there exists a constant $C_0 > 0$ such that

$$P(x) \sim \frac{C_0}{\log x}.$$

3) Let $N := \prod_{p \le x} p$. By using the prime number theorem and the previous question, prove that:

$$\phi(N) \sim \frac{C_0 N}{\log\log N}.$$


4) Let $p_1 < p_2 < p_3 < \dots$ be the increasing sequence of prime numbers. For $n \ge 2$, we denote by $\omega(n)$ the number of distinct prime numbers which divide $n$. Prove that $p_1 \cdots p_{\omega(n)} \le n$, and deduce that there exists a constant $c > 0$ such that

$$\omega(n) \le \frac{c \log n}{\log\log n}.$$

5) Now let $n \ge 2$. Prove that

$$\prod_{k=1}^{\omega(n)} \left(1 - \frac{1}{p_k}\right) \le \frac{\phi(n)}{n},$$

and deduce from this an estimate of the form

$$\forall n \ge 2, \qquad \frac{C_0\, n}{\log\log n}\,(1 + \epsilon(n)) \le \phi(n),$$

where $\lim \epsilon(n) = 0$.⁷


7 One could show that in fact $C_0 = e^{-\gamma}$ where $\gamma$ is Euler's constant.
6.19. Exercise. 1) Prove that the sequence

$$g_n(z) = \frac{z(z+1)\cdots(z+n)\,n^{-z}}{n!}$$

converges uniformly on every compact set in the complex plane and therefore defines an entire function $G(z) := \lim_n g_n(z)$, which only has simple zeros at $z = 0, -1, -2, \dots, -n$, etc.

2) Verify that $G$ satisfies the formulas $G(z+1) = G(z)/z$ and

$$G(z)G(1-z) = z \prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2}\right) = \frac{\sin(\pi z)}{\pi}.$$


Hint. The second equality is well-known and can be shown by comparing the zeros of the two functions.


3) Deduce from this that $U(z) := \Gamma(z)G(z)$ is periodic with period 1 and satisfies $U(z)U(1-z) = 1$.


Hint. You could use the "reflection formula" given by

$$\Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin(\pi z)},$$

which is usually proven using the formula

$$\Gamma(x)\Gamma(y) = \Gamma(x+y) \int_0^1 t^{x-1}(1-t)^{y-1}\,dt.$$


You could either prove this formula or consult a real and complex variable analysis text.


4) Deduce from this that $U(z)$, and consequently $\Gamma(z)$, does not vanish anywhere in the complex plane.


Let us point out that we could compute even further and prove that $U = 1$ and $G(z) = \Gamma(z)^{-1}$, which proves that

$$\Gamma(z) = \lim_{n \to \infty} \frac{n!\,n^z}{z(z+1)\cdots(z+n)}.$$
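
The limit formula for $\Gamma(z)$ at the end of Exercise 6.19 converges slowly (the relative error behaves like a constant times $1/n$), which a short numerical experiment makes visible. The sketch below assumes a real $z > 0$ and compares against Python's `math.gamma`; the name `gauss_gamma` is illustrative.

```python
import math


def gauss_gamma(z, n):
    """Approximate Gamma(z) for real z > 0 by n! n^z / (z (z+1) ... (z+n)).

    The product is evaluated through logarithms to avoid overflow.
    """
    log_value = (math.lgamma(n + 1)            # log(n!)
                 + z * math.log(n)             # log(n^z)
                 - sum(math.log(z + k) for k in range(n + 1)))
    return math.exp(log_value)


for n in (10, 1000, 100000):
    print(n, gauss_gamma(0.5, n), math.gamma(0.5))
```

The printed approximations approach $\Gamma(1/2) = \sqrt{\pi}$ as $n$ grows.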


6.20. Exercise. Recall that the Möbius function $\mu(n)$ can be defined by the formula

$$\zeta(s)^{-1} = \sum_{n \ge 1} \frac{\mu(n)}{n^s}, \quad \text{for } \mathrm{Re}(s) > 1,$$

and let $M(x) := \sum_{n \le x} \mu(n)$. Observe that $|M(x)| \le x$ for $x \ge 0$.

a) Verify that the function defined on the open half-plane $\mathrm{Re}(s) > 1$ by the Dirichlet series $\sum_{n \ge 1} \mu(n)n^{-s}$ can be analytically continued to an open set which contains the closed half-plane $\mathrm{Re}(s) \ge 1$. Prove that this function vanishes at $s = 1$.


b) Prove that the following formula is valid for all $s \in \mathbf{C}$:

$$\sum_{n=1}^{N} \frac{\mu(n)}{n^s} = s \int_1^N M(t)\,t^{-s-1}\,dt + M(N)N^{-s}.$$


c) Deduce from this that if $\mathrm{Re}(s) > 1$, then

$$\zeta(s)^{-1} = s \int_1^{\infty} M(t)\,t^{-s-1}\,dt,$$

and conclude that the integral $\int_1^{\infty} \frac{M(t)\,dt}{t^2}$ is convergent and zero.

Hint. You could use a), the fact that $M(t)t^{-1}$ is bounded and Newman's "analytic theorem" to prove that the integral is equal to the value of $\zeta(s)^{-1}$ at $s = 1$.


d) Prove that


at 


M 
in {yo aol = MO) 
n<xa 


e) We would like to show that the prime number theorem implies that 
M(x) := Vince M(@) = o(@). Let A(x) = nce (nm) logn. Prove that 

” M(t 
H(x) = M(x)loge— f a a 
0 


then that H(x) = >> 


number theorem (i.e., W(x) ~ x), that 


p(n)w(x2/n), and conclude, by using the prime 


nxx 


n<Kx 


f) Deduce from this the value of the sum 


> y(n) 


6.21. Exercise. In this exercise, we denote by $\pi(X)$ the number of prime numbers less than or equal to $X$. We will use the following form of the prime number theorem:

$$\pi(X) = \frac{X}{\log X} + O\!\left(\frac{X}{(\log X)^2}\right).$$


a) Prove that the following two estimates hold whenever $\alpha > -1$:

$$\int_2^X \frac{t^{\alpha}\,dt}{(\log t)^2} = O\!\left(\frac{X^{\alpha+1}}{(\log X)^2}\right),$$

$$\int_2^X \frac{t^{\alpha}\,dt}{\log t} = \frac{X^{\alpha+1}}{(\alpha+1)\log X} + O\!\left(\frac{X^{\alpha+1}}{(\log X)^2}\right).$$


b) By using Abel's summation formula, prove that if $f$ is a continuously differentiable function, then

$$\sum_{p \le X} f(p) = \pi(X)f(X) - \int_2^X \pi(t)f'(t)\,dt.$$


c) Prove, still assuming that $\alpha > -1$, the following generalization of the prime number theorem:

$$\sum_{p \le X} p^{\alpha} = \frac{X^{\alpha+1}}{(\alpha+1)\log X} + O\!\left(\frac{X^{\alpha+1}}{(\log X)^2}\right).$$


Chapter 5 


Elliptic Curves 


“Mais où sont les neiges d’antan?” 


FRANÇOIS VILLON 


An elliptic curve can be defined as a smooth projective curve of degree 3 
in the projective plane with a distinguished point chosen on it. The set 
of points on the curve can thus be endowed with a natural additive group 
structure. The most concrete description of an elliptic curve comes from 
its affine equation, written as 


$$y^2 = x^3 + ax + b, \quad \text{with } 4a^3 + 27b^2 \neq 0.$$


The theory of elliptic curves is a marvelous mixture of elementary math- 
ematics and profound, advanced mathematics, a mixture which moreover 
lies on the crossroads of multiple theories: arithmetic, algebraic geometry, 
group representations, complex analysis, etc. Here, we will provide an in- 
troduction to the subject and prove the main Diophantine theorems: the 
group of rational points is finitely generated (the Mordell-Weil theorem) 
and the set of integral points is finite (Siegel’s theorem). Finally, we will 
evoke the famous theorem of Wiles—whose proof resulted in the proof of 
Fermat’s last theorem—and the Birch & Swinnerton-Dyer conjecture. 


1. Group Law on a Cubic 


Here, the word "cubic" designates an algebraic curve $C$ in the projective plane $\mathbf{P}^2$ defined by a homogeneous equation, $F(X,Y,Z) = 0$, of degree 3. The curve is smooth if it has a tangent line at each point, i.e., if $\left(\frac{\partial F}{\partial X}, \frac{\partial F}{\partial Y}, \frac{\partial F}{\partial Z}\right) \neq (0,0,0)$ at each point of the curve (see Appendix B for an introduction to projective geometry). If $F \in K[X,Y,Z]$, recall that we denote by $C(K)$ the set of rational points over $K$, in other words, the set $\{(x,y,z) \in \mathbf{P}^2(K) \mid F(x,y,z) = 0\}$.


M. Hindry, Arithmetics, Universitext, 
DOI 10.1007/978-1-4471-2131-2_5, 
© Springer-Verlag London Limited 2011 


1.1. Definition. Let $C$ be a smooth cubic. If $P$ and $Q$ are distinct points on $C$, the line joining $P$ and $Q$ cuts the cubic at three points, $P$, $Q$ and a third point $R$ (possibly equal to $P$ or $Q$) that is denoted by $R = P \circ Q$. If $P = Q$, we do the same operation with the line tangent to the curve $C$ at $P$. We define the law $+$ by choosing a distinguished point called the "origin" $O \in C$ and setting $O' = O \circ O$, then

$$P + Q := O \circ (P \circ Q) \quad \text{and} \quad -P := O' \circ P.$$


The procedure which defines this addition law is called the "chord-tangent" method.


1.2. Theorem. The law defined by the chord-tangent method on a smooth 
cubic is a commutative group law, where the identity element is given by 
the distinguished point O. If O € C(K), then C(K) is an abelian group. 


Since $P \circ Q = Q \circ P$, the law $+$ is obviously commutative. Moreover, we know that $O + P = P$, since $O$, $P$ and $O \circ P$ are colinear. If $Q = -P$, then $Q$, $O'$ and $P$ are colinear, hence $O' = Q \circ P$ and $O \circ O' = O$, and therefore $O = Q + P$. The only tricky point is the associativity. We will use the following two classical lemmas to prove this.


1.3. Lemma. Let $P_1, \dots, P_8$ be eight distinct points in $\mathbf{P}^2$. Assume that no subset of four of them is ever colinear and no subset of seven of them ever appears on the same conic. Then the vector space of homogeneous polynomials of degree 3 which vanish at $P_1, \dots, P_8$ is of dimension 2.


Proof. Let $n$ be the dimension that we are looking for. No matter how the eight points $P_i$ are positioned, we know that $n \ge 10 - 8 = 2$. If three of the points are colinear, say $P_1, P_2, P_3$ without loss of generality, then we can choose $P_9$ on the same line, whose equation is given by $L = 0$. Every cubic $F$ which vanishes at $P_1, \dots, P_9$ is therefore of the form $LQ$, where $Q$ vanishes at $P_4, \dots, P_8$. But by Lemma B-1.18, given five points no four of which are colinear, there is only one conic which contains all five, say $Q_0 = 0$, and $F$ is a multiple of $LQ_0$. The dimension $n_0$ of the space of these cubics is therefore equal to 1. Thus $n \le n_0 + 1 = 2$. Now suppose that $P_1, \dots, P_6$ lie on a conic $Q = 0$ and choose $P_9$ on this conic. Every cubic $F$ vanishing at $P_1, \dots, P_9$ is therefore of the form $LQ$, where $L = 0$ is the equation of the line $(P_7, P_8)$. The dimension $n_0$ of the space of these cubics is therefore equal to 1. Thus, $n \le n_0 + 1 = 2$. In the general case (no three-tuple of points lie on the same line and no six-tuple of points lie on the same conic), we introduce $P_9$ and $P_{10}$ which lie on the line $(P_1, P_2)$ with equation $L = 0$. If $n \ge 3$, there would exist a nontrivial cubic $F = 0$ passing through $P_1, \dots, P_{10}$, but then $F = LQ$ and the conic with equation $Q = 0$ would pass through $P_3, \dots, P_8$.


1.4. Lemma. Let $P_1, \dots, P_9$ be the intersection points of two cubics $C_1$ and $C_2$, one of which is irreducible. Suppose that $P_1, \dots, P_8$ are distinct. If a cubic $C$ passes through $P_1, \dots, P_8$, then it also passes through $P_9$.


Proof. If, for example, $C_1$ is irreducible, then it contains neither four colinear points, nor seven points on the same conic. By the previous lemma, the vector space of cubics vanishing at $P_1, \dots, P_8$ has dimension 2 and is therefore generated by the equations of $C_1$ and $C_2$.


Proof. (of Theorem 5-1.2) Let $P$, $Q$ and $R$ be three distinct points on the cubic $C$. The line $L_1 = (P,Q)$ intersects $C$ at $P$, $Q$ and $T$; the line $L_2 = (T,O)$ intersects $C$ at $T$, $O$ and $T'$; the line $L_3 = (R,T')$ intersects $C$ at $R$, $T'$ and $U$; and finally, the line $L_4 = (U,O)$ intersects $C$ at $U$, $O$ and $U'$, so that $(P+Q)+R = U'$. Moreover, the line $M_1 = (Q,R)$ intersects $C$ at $Q$, $R$ and $S$; the line $M_2 = (S,O)$ intersects $C$ at $S$, $O$ and $S'$; the line $M_3 = (P,S')$ intersects $C$ at $P$, $S'$ and $V$; and finally, the line $M_4 = (V,O)$ intersects $C$ at $V$, $O$ and $V'$, so that $(Q+R)+P = V'$. We want to show that $U' = V'$, which is equivalent to $U = V$. To do this, consider the cubics $C_1 = L_1 + M_2 + L_3$ and $C_2 = M_1 + L_2 + M_3$. Then,

$$C \cap C_1 = \{P, Q, R, O, T, T', S, S', U\} \quad \text{and} \quad C \cap C_2 = \{P, Q, R, O, T, T', S, S', V\}.$$

If the points $P, Q, R, O, T, T', S, S'$ are distinct, we can conclude, by Lemma 5-1.4 applied to $C$ and $C_1$, that $U = V$. This is the case in general, and we can conclude that the equality $(P+Q)+R = (Q+R)+P$ is always true by invoking a continuity argument (either for the usual topology if we are working over $\mathbf{R}$ or $\mathbf{C}$ or the Zariski topology in the general case; see Appendix B, Lemma B-1.9).


We would like to point out that this construction only uses the simplest two cases of Bézout's theorem on the intersection with a line or a conic (see Lemmas B-1.13 and B-1.14). Finally, we have the following operations over a curve: "translation by $Q$", defined by $P \mapsto P + Q$, and "multiplication by $N$", defined by $P \mapsto P + \cdots + P$ ($N$ times) and denoted by $[N]$.


We will now explain this group law with a simpler model, known as the 
“Weierstrass” model. 


1.5. Definition. A Weierstrass cubic is a curve given by a plane cubic 
equation of the form 


$$Y^2Z = X^3 + aXZ^2 + bZ^3, \tag{5.1}$$
where $\Delta := 4a^3 + 27b^2 \neq 0$.


1.6. Remarks. The condition $\Delta \neq 0$ precisely means that the curve does not have a singular point. The curve defined by (5.1) has an obvious point, $O := (0,1,0)$, that we will take to be the origin and which is an inflection point, i.e., the tangent $Z = 0$ intersects the curve only at this point with multiplicity 3. It can be shown that every smooth cubic which has a rational point over $K$ is isomorphic to a Weierstrass cubic, at least whenever $K$ does not have characteristic 2 or 3. If we wish to include the case of characteristic 2 and 3 (for example, to study elliptic curves over $\mathbf{F}_2$ or $\mathbf{F}_3$), we must use an equation which is more general than (5.1), namely an equation of the form

$$Y^2Z + a_1XYZ + a_3YZ^2 = X^3 + a_2X^2Z + a_4XZ^2 + a_6Z^3. \tag{5.2}$$

In fact, the latter is the general equation of a cubic having an inflection point at $(0,1,0)$, with tangent $Z = 0$ and that is normalized by the dilation (scaling) $(X,Y,Z) \mapsto (\alpha X, \beta Y, \gamma Z)$ so that the coefficients of the monomials $Y^2Z$ and $X^3$ become equal to 1. Take note that if the characteristic of the field is not 2 or 3, we can easily reduce (5.2) to the form (5.1). To see why this is true, by letting $Y' := Y + (a_1X + a_3Z)/2$, we can transform the equation into $Y'^2Z = X^3 + \frac{4a_2 + a_1^2}{4}X^2Z + \cdots$; by then setting $X' := X + \frac{4a_2 + a_1^2}{12}Z$, we obtain a Weierstrass equation of the form (5.1).
Coming back to the Weierstrass model, (5.1), we will often work in the affine coordinates $x := X/Z$ and $y := Y/Z$ by considering the point $O$ as "the point at infinity". The affine equation is therefore of the type announced in the introduction:

$$y^2 = x^3 + ax + b. \tag{5.3}$$


A possible singular point would satisfy $2y = 3x^2 + a = 0$, hence $y = 0$ and $x$ is a double root of the equation $x^3 + ax + b = 0$, whose discriminant is, up to sign, precisely $4a^3 + 27b^2$. Moreover, if $\alpha$ is a root of $x^3 + ax + b$, then the point $P := (\alpha, 0)$ is a point of order 2; thus there are three points of order 2.


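1.7. Proposition. (Explicit group law) If $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ are points on the curve whose equation is given by (5.3), then

$$[-1](P_1) = (x_1, -y_1). \tag{5.4}$$

If $P_2 = [-1](P_1)$ (i.e., if $x_1 = x_2$ and $y_2 = -y_1$), then $P_1 + P_2 = O$. If $P_2 = P_1$, we set $\lambda = \frac{3x_1^2 + a}{2y_1}$ and $\mu = y_1 - \lambda x_1$, and if $P_2 \neq \pm P_1$ (i.e., if $x_2 \neq x_1$), we set $\lambda = \frac{y_1 - y_2}{x_1 - x_2}$ and $\mu = y_1 - \lambda x_1$. Then,

$$P_1 + P_2 = \big(\lambda^2 - x_1 - x_2,\; -\lambda^3 + \lambda(x_1 + x_2) - \mu\big). \tag{5.5}$$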

Proof. Let $y = \lambda x + \mu$ be the equation of the line $(P_1, P_2)$ (resp. of the tangent to the curve at $P_1$) if $P_1 \neq P_2$ (resp. if $P_1 = P_2$). Then $\lambda$ and $\mu$ are given as in the statement. If $P_3 = (x_3, y_3)$ is the third point of intersection, then $P_1 + P_2 = (x_3, -y_3)$. To compute the intersection points of the line and the curve, we make a substitution for $y$ to obtain the equation

$$x^3 + ax + b - (\lambda x + \mu)^2 = x^3 - \lambda^2x^2 + (a - 2\lambda\mu)x + (b - \mu^2) = 0,$$

of which we know two roots: $x_1$ and $x_2$. From this, we obtain $x_1 + x_2 + x_3 = \lambda^2$ and $y_3 = \lambda x_3 + \mu$ as in the statement of the proposition.
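
The explicit formulas (5.4) and (5.5) translate directly into a short program. The sketch below works over $\mathbf{Q}$ with exact fractions; representing the point at infinity by `None`, the function name `add`, and the sample curve $y^2 = x^3 + 17$ with its three points are choices made here for illustration only.

```python
from fractions import Fraction

O = None  # the point at infinity, used as the origin


def add(P, Q, a):
    """Add two points of y^2 = x^3 + a x + b using formulas (5.4)-(5.5).

    The coefficient b is not needed by the formulas themselves.
    """
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:          # Q = [-1](P), including 2-torsion
        return O
    if P == Q:                          # tangent line
        lam = (3 * x1 * x1 + a) / (2 * y1)
    else:                               # chord through P and Q
        lam = (y1 - y2) / (x1 - x2)
    mu = y1 - lam * x1
    x3 = lam * lam - x1 - x2
    return (x3, -(lam * x3 + mu))


def on_curve(P, a, b):
    return P is O or P[1] ** 2 == P[0] ** 3 + a * P[0] + b


# Sample data (assumption): the curve y^2 = x^3 + 17, i.e. a = 0, b = 17.
a, b = Fraction(0), Fraction(17)
P = (Fraction(-2), Fraction(3))
Q = (Fraction(-1), Fraction(4))
R = (Fraction(2), Fraction(5))

print(add(P, Q, a))                                  # the point (4, -9)
print(add(add(P, Q, a), R, a) == add(P, add(Q, R, a), a))
```

Exact rational arithmetic makes the associativity check in the last line meaningful; with floating-point numbers the two sides would only agree approximately.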


To verify continuity of the addition law in the Zariski topological sense (Definition B-1.7), observe that

$$\frac{y_1 - y_2}{x_1 - x_2} = \frac{x_1^2 + x_1x_2 + x_2^2 + a}{y_1 + y_2}. \tag{5.6}$$


For future use, we also have the following formulas (that can be checked by direct computation):

$$x(P+Q) + x(P-Q) = \frac{2\big(x(P)x(Q) + a\big)\big(x(P) + x(Q)\big) + 4b}{\big(x(P) - x(Q)\big)^2}, \tag{5.7}$$

$$x(P+Q)\,x(P-Q) = \frac{\big(x(P)x(Q) - a\big)^2 - 4b\big(x(P) + x(Q)\big)}{\big(x(P) - x(Q)\big)^2}, \tag{5.8}$$

$$x(2P) = \frac{x(P)^4 - 2a\,x(P)^2 - 8b\,x(P) + a^2}{4\big(x(P)^3 + a\,x(P) + b\big)}. \tag{5.9}$$


2. Heights 


We will introduce a precise notion of “size” or “arithmetic complexity” for 
the algebraic points in a projective space, which will be christened “height”. 
The first version is sometimes called the Weil height and the refined ver- 
sion, over elliptic curves, the Néron-Tate height, which we will prove has 
a quadratic nature. 


2.1. Weil Heights 


We will start by defining the height of a point in a projective space first 
with rational coordinates, then with algebraic coordinates. From this, we 
will deduce the notion of the height of an algebraic number. 


2.1.1. Definition. If $P$ is a point in $\mathbf{P}^n(\mathbf{Q})$, we can choose projective coordinates for it, $(x_0, \dots, x_n)$, where $x_i \in \mathbf{Z}$ and $\gcd(x_0, \dots, x_n) = 1$. We thus define the height (resp. the logarithmic height) of $P$ by

$$H(P) := \max(|x_0|, \dots, |x_n|) \quad (\text{resp. } h(P) := \log\max(|x_0|, \dots, |x_n|)).$$


This very simple and natural definition does not translate very easily into 
algebraic coordinates, and it is technically more convenient to reinterpret 
height in terms of the set of absolute values of a field. 


2.1.2. Definition. An absolute value $v$ over a field $K$ is a map $x \mapsto |x|_v$ from $K$ to $\mathbf{R}_{\ge 0}$ such that for every $x, y \in K$,


i) $|x|_v = 0$ if and only if $x = 0$; 
ii) $|xy|_v = |x|_v|y|_v$; 
iii) there exists a constant $C_v > 0$ such that $|x + y|_v \le C_v \max\{|x|_v, |y|_v\}$. 


If $v$ satisfies the more precise inequality $|x + y|_v \le \max(|x|_v, |y|_v)$ (i.e., we can take $C_v$ to be 1), $v$ is said to be ultrametric.


2.1.3. Example. The standard absolute values over the field $K = \mathbf{Q}$ are the usual absolute value (denoted $|x|$ or $|x|_\infty$) and the $p$-adic absolute values (denoted $|x|_p$). For every prime number $p$, the $p$-adic absolute value is defined for $x = \pm p_1^{e_1} \cdots p_r^{e_r}$, where $e_i = \operatorname{ord}_{p_i}(x) \in \mathbf{Z}$, by

$$|x|_p = p^{-\operatorname{ord}_p(x)}.$$

The $p$-adic absolute values are ultrametric. We denote by $M_{\mathbf{Q}}$ the set of these absolute values, which we will also call places of the field $\mathbf{Q}$.


2.1.4. Remark. If $|\cdot|$ is an absolute value, then $|\cdot|^{\alpha}$ is another one (for every $\alpha > 0$). If a map $|\cdot|_v$ satisfies i) and ii) and the triangle inequality iii)' given by $|x + y|_v \le |x|_v + |y|_v$, then it satisfies iii) with $C_v = 2$ and is hence an absolute value. Conversely, we will leave it as an exercise to prove that an absolute value for which we can take the constant in inequality iii) to be 2 satisfies the triangle inequality. The reason that we take iii) in the definition is that the condition is stable when we replace $|\cdot|$ by $|\cdot|^{\alpha}$, which is not the case for the triangle inequality.


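2.1.5. Theorem. (Product formula for $\mathbf{Q}$) Let $x \in \mathbf{Q}^*$. Then,

$$\prod_{v \in M_{\mathbf{Q}}} |x|_v = 1. \tag{5.10}$$


Proof. We write $x = \pm p_1^{e_1} \cdots p_r^{e_r}$, where $e_i = \operatorname{ord}_{p_i}(x) \in \mathbf{Z}$. For $1 \le i \le r$, we have $|x|_{p_i} = p_i^{-e_i}$. If $p$ does not appear in $x$, then $|x|_p = 1$ and the usual absolute value equals $|x|_\infty = p_1^{e_1} \cdots p_r^{e_r}$. The formula follows from this.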
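
The product formula (5.10) is easy to test on examples. The following sketch computes every nontrivial absolute value of a nonzero rational number exactly; the helper names are illustrative, and only the primes dividing the numerator or denominator are visited since all other places contribute 1.

```python
from fractions import Fraction


def prime_divisors(n):
    """Set of primes dividing the integer n (n = 0 gives the empty set)."""
    n, found, p = abs(n), set(), 2
    while p * p <= n:
        while n % p == 0:
            found.add(p)
            n //= p
        p += 1
    if n > 1:
        found.add(n)
    return found


def ord_p(x, p):
    """p-adic valuation of a nonzero Fraction x."""
    num, den, k = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-ord_p(x))."""
    return Fraction(p) ** (-ord_p(x, p))


def product_over_places(x):
    """Product of |x|_v over the usual and all p-adic absolute values."""
    total = abs(x)                                   # archimedean place
    for p in prime_divisors(x.numerator) | prime_divisors(x.denominator):
        total *= abs_p(x, p)
    return total


x = Fraction(-360, 77)
print(abs_p(x, 2), abs_p(x, 7), product_over_places(x))
```

Here $-360/77 = -2^3 \cdot 3^2 \cdot 5 \cdot 7^{-1} \cdot 11^{-1}$, so the 2-adic absolute value is $1/8$, the 7-adic one is $7$, and the full product is $1$.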


2.1.6. Corollary. Let $P \in \mathbf{P}^n(\mathbf{Q})$ and $(x_0, \dots, x_n)$ be (any) projective coordinates of $P$. Then,

$$H(P) = \prod_{v \in M_{\mathbf{Q}}} \max(|x_0|_v, \dots, |x_n|_v). \tag{5.11}$$


Proof. By the product formula, we know that the right-hand side is independent of the projective coordinates. Therefore, if we choose $x_i \in \mathbf{Z}$ to be relatively prime, we have, for each prime number $p$, $\max(|x_0|_p, \dots, |x_n|_p) = 1$, and the right-hand side will indeed be equal to $\max(|x_0|_\infty, \dots, |x_n|_\infty)$, in other words to $H(P)$.
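
The corollary says that the height can be computed from arbitrary rational coordinates, without first clearing denominators. The sketch below compares the two computations on one sample point and on a rescaling of it; all function names and the sample coordinates are assumptions made for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def height_normalized(coords):
    """H(P) as in Definition 2.1.1: rescale to coprime integers, take the max."""
    coords = [Fraction(c) for c in coords]
    common = reduce(lambda u, v: u * v // gcd(u, v),
                    (c.denominator for c in coords), 1)
    ints = [int(c * common) for c in coords]
    g = reduce(gcd, ints)
    return max(abs(i // g) for i in ints)


def prime_divisors(n):
    n, found, p = abs(n), set(), 2
    while p * p <= n:
        while n % p == 0:
            found.add(p)
            n //= p
        p += 1
    if n > 1:
        found.add(n)
    return found


def ord_p(x, p):
    num, den, k = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def height_by_places(coords):
    """H(P) as the product over all places in formula (5.11)."""
    coords = [Fraction(c) for c in coords]
    nonzero = [c for c in coords if c != 0]
    primes = set()
    for c in nonzero:
        primes |= prime_divisors(c.numerator) | prime_divisors(c.denominator)
    total = max(abs(c) for c in coords)              # archimedean factor
    for p in primes:                                 # p-adic factors
        total *= max(Fraction(p) ** (-ord_p(c, p)) for c in nonzero)
    return total


P = [Fraction(1, 2), Fraction(3), Fraction(-5, 4)]
scaled = [Fraction(7, 3) * c for c in P]             # same projective point
print(height_normalized(P), height_by_places(P), height_by_places(scaled))
```

All three printed values equal 12, since the coprime integer coordinates of this point are $(2, 12, -5)$.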


In order to generalize heights to algebraic coordinates, we will define stan- 
dard absolute values over a number field K. 


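2.1.7. Example. Let $K$ be a number field with $r_1$ real embeddings and $r_2$ pairs of complex embeddings, so that $n := [K : \mathbf{Q}] = r_1 + 2r_2$. Every embedding $\sigma : K \to \mathbf{R}$ or $\mathbf{C}$ produces, by composition with the modulus, an absolute value. If the embedding is complex, then $\sigma$ and its conjugate produce the same absolute value. This then gives us $r_1 + r_2$ absolute values:

$$|x|_\sigma := \begin{cases} |\sigma(x)| & \text{if } \sigma \text{ is real,} \\ |\sigma(x)|^2 & \text{if } \sigma \text{ is complex.} \end{cases}$$


Now if $p$ is a prime number which factors into $p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_g^{e_g}$, with $N\mathfrak{p}_i = p^{f_i}$ and $\sum_{i=1}^{g} e_if_i = n$, then for every prime ideal $\mathfrak{p}$, we can define the absolute value

$$|x|_{\mathfrak{p}} := N\mathfrak{p}^{-\operatorname{ord}_{\mathfrak{p}}(x)}.$$


We denote by $M_K$ the set of these absolute values and by $M_{K,\infty}$ the subset of Archimedean absolute values. It follows from these choices that for $x \in K$, we have

$$\prod_{\mathfrak{p} \mid p} |x|_{\mathfrak{p}} = \big|N^K_{\mathbf{Q}}(x)\big|_p, \quad \text{and} \quad \prod_{v \in M_{K,\infty}} |x|_v = \big|N^K_{\mathbf{Q}}(x)\big|_\infty. \tag{5.12}$$


To understand this statement, if $x\mathcal{O}_K = \prod_{\mathfrak{p}} \mathfrak{p}^{\operatorname{ord}_{\mathfrak{p}}(x)}$, then we can write

$$N^K_{\mathbf{Q}}(x) = \pm N(x\mathcal{O}_K) = \pm\prod_{\mathfrak{p}} N\mathfrak{p}^{\operatorname{ord}_{\mathfrak{p}}(x)} = \pm\prod_{p} p^{\sum_{\mathfrak{p} \mid p} f_{\mathfrak{p}} \operatorname{ord}_{\mathfrak{p}}(x)},$$

where $N\mathfrak{p} = p^{f_{\mathfrak{p}}}$. Hence we have

$$\big|N^K_{\mathbf{Q}}(x)\big|_p = p^{-\sum_{\mathfrak{p} \mid p} f_{\mathfrak{p}} \operatorname{ord}_{\mathfrak{p}}(x)} = \prod_{\mathfrak{p} \mid p} N\mathfrak{p}^{-\operatorname{ord}_{\mathfrak{p}}(x)} = \prod_{\mathfrak{p} \mid p} |x|_{\mathfrak{p}}.$$

For the Archimedean places, we have

$$\big|N^K_{\mathbf{Q}}(x)\big| = \Big|\prod_{\sigma : K \hookrightarrow \mathbf{C}} \sigma(x)\Big| = \prod_{i=1}^{r_1} |\sigma_i(x)| \prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|^2 = \prod_{v \in M_{K,\infty}} |x|_v.$$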


This gives us the following formula, which is analogous to Theorem 5-2.1.5.


2.1.8. Theorem. (Product formula for $K$) Let $x \in K^*$. Then,

$$\prod_{v \in M_K} |x|_v = 1. \tag{5.13}$$


Proof. We can regroup the places of $K$ into packets over a certain place of $\mathbf{Q}$, and we can use the previous formula:

$$\prod_{w \in M_K} |x|_w = \prod_{v \in M_{\mathbf{Q}}} \prod_{w \mid v} |x|_w = \prod_{v \in M_{\mathbf{Q}}} \big|N^K_{\mathbf{Q}}(x)\big|_v = 1.$$


2.1.9. Definition. Let $P \in \mathbf{P}^n(K)$, and let $(x_0, \dots, x_n)$ be (any) projective coordinates of $P$. The height relative to the field $K$ is the number

$$H_K(P) = \prod_{v \in M_K} \max(|x_0|_v, \dots, |x_n|_v). \tag{5.14}$$


The height of a point considered in different algebraic extensions varies in 
a simple way. 


2.1.10. Lemma. If $K'$ is a finite extension of $K$ and $P \in \mathbf{P}^n(K)$, then

$$H_{K'}(P) = H_K(P)^{[K':K]}. \tag{5.15}$$


Proof. Let $(x_0, \dots, x_n)$ be projective coordinates of $P$. We can assume that $x_i \in K$. If $v$ is a place of $K$ and $w$ ranges over the places of $K'$ over $v$, we clearly have

$$\prod_{w \mid v} \max_i |x_i|_w = \max_i |x_i|_v^{[K':K]}.$$


We thus have

$$H_{K'}(P) = \prod_{w \in M_{K'}} \max_i |x_i|_w = \prod_{v \in M_K} \prod_{w \mid v} \max_i |x_i|_w = \prod_{v \in M_K} \max_i |x_i|_v^{[K':K]} = H_K(P)^{[K':K]}.$$

This lemma allows us to define the absolute height, which is defined on the set of points with coordinates in $\overline{\mathbf{Q}}$, the algebraic closure of $\mathbf{Q}$.


2.1.11. Definition. We define $H : \mathbf{P}^n(\overline{\mathbf{Q}}) \to \mathbf{R}$ as follows: if $P \in \mathbf{P}^n(K)$, we let

$$H(P) := H_K(P)^{1/[K:\mathbf{Q}]}.$$


If $\alpha \in K$, we define the height of $\alpha$ (relative to the field $K$) as the height of the point $(1, \alpha) \in \mathbf{P}^1(K)$.


To establish a connection between the height of an algebraic number and its minimal polynomial, we will use the following classical lemma (see Lemma 2-6.2.3, which is valid for $\mathbf{Z}[X]$).


2.1.12. Lemma. (Gauss's lemma) Let $P, Q \in K[X]$, where $K$ is a number field. We denote by $\|P\|_v$ the sup-norm of the coefficients of $P$ for an ultrametric absolute value $v$. Then,

$$\|PQ\|_v = \|P\|_v \|Q\|_v. \tag{5.16}$$

Proof. By localizing (i.e., replacing $\mathcal{O}_K$ by $\mathcal{O} := \{x \in K \mid \operatorname{ord}_v x \ge 0\}$), we can reduce to the case of showing that if $\|P\|_v = \|Q\|_v = 1$, then $\|PQ\|_v = 1$. Indeed, if $\pi$ is a generator of the maximal ideal in $\mathcal{O}$ associated to the absolute value, then we can write $P = \pi^m P^*$ and $Q = \pi^n Q^*$, where $\|P^*\|_v = \|Q^*\|_v = 1$. Now, $\|P\|_v = 1$ means that $P \in \mathcal{O}[X]$ and is non-zero in $(\mathcal{O}/\pi\mathcal{O})[X]$. Since the latter ring is an integral domain, the product $PQ$ stays non-zero in $(\mathcal{O}/\pi\mathcal{O})[X]$, and thus $\|PQ\|_v = 1$.


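2.1.13. Lemma. Let $\alpha$ be an algebraic number and $K = \mathbf{Q}(\alpha)$. Let the minimal polynomial of $\alpha$ in $\mathbf{Z}[X]$ be written in the form

$$P(X) = a_0(X - \alpha_1) \cdots (X - \alpha_d) = a_0X^d + \cdots.$$

Then,

$$H_K(\alpha) = |a_0| \prod_{i=1}^{d} \max\{1, |\alpha_i|\}. \tag{5.17}$$

Proof. Consider the field $L := \mathbf{Q}(\alpha_1, \dots, \alpha_d)$. Then

$$H_L(\alpha) = H_K(\alpha)^{[L:K]} = \prod_{\mathfrak{q} \in M_L \setminus M_{L,\infty}} \max(1, |\alpha|_{\mathfrak{q}}) \prod_{w \in M_{L,\infty}} \max(1, |\alpha|_w).$$


First of all, we have

$$\prod_{w \in M_{L,\infty}} \max(1, |\alpha|_w) = \prod_{v \in M_{K,\infty}} \max(1, |\alpha|_v)^{[L:K]} = \Big(\prod_{i=1}^{d} \max(1, |\alpha_i|)\Big)^{[L:K]}.$$


Gauss's lemma applied to $P$ and its factorization shows that, for $\mathfrak{q} \in M_L \setminus M_{L,\infty}$, we have

$$1 = \|P\|_{\mathfrak{q}} = |a_0|_{\mathfrak{q}} \prod_{i=1}^{d} \max(1, |\alpha_i|_{\mathfrak{q}}).$$

By taking the product over $\mathfrak{q}$ and invoking the product formula for $L$ (applied to $a_0$), we obtain

$$1 = \prod_{\mathfrak{q}} |a_0|_{\mathfrak{q}} \prod_{\mathfrak{q}} \prod_{i=1}^{d} \max(1, |\alpha_i|_{\mathfrak{q}}) = |a_0|^{-[L:\mathbf{Q}]} \Big(\prod_{\mathfrak{q}} \max(1, |\alpha|_{\mathfrak{q}})\Big)^{d},$$

where the products run over $\mathfrak{q} \in M_L \setminus M_{L,\infty}$. Combining these results gives

$$H_K(\alpha)^{[L:K]} = |a_0|^{[L:\mathbf{Q}]/d} \prod_{i=1}^{d} \max(1, |\alpha_i|)^{[L:K]},$$

which yields the desired equality by taking the $[L : K]$th roots.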
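
Formula (5.17) makes the height of an algebraic number computable from its minimal polynomial. The sketch below treats only the quadratic case, in floating point; it assumes (without checking) that the input $a_0X^2 + a_1X + a_2$ is an irreducible polynomial in $\mathbf{Z}[X]$ with coprime coefficients, and the function names are illustrative.

```python
import cmath


def relative_height_quadratic(a0, a1, a2):
    """H_K(alpha) = |a0| * prod max(1, |alpha_i|) for a quadratic alpha.

    Assumption: a0 X^2 + a1 X + a2 is the minimal polynomial of alpha
    in Z[X] (irreducible, coprime coefficients); this is not verified.
    """
    root_disc = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    roots = [(-a1 + root_disc) / (2 * a0), (-a1 - root_disc) / (2 * a0)]
    height = float(abs(a0))
    for r in roots:
        height *= max(1.0, abs(r))
    return height


def absolute_height_quadratic(a0, a1, a2):
    """H(alpha) = H_K(alpha)^(1/[K:Q]) with [K:Q] = 2."""
    return relative_height_quadratic(a0, a1, a2) ** 0.5


print(relative_height_quadratic(1, -1, -1))   # golden ratio, about 1.618
print(relative_height_quadratic(1, 0, -2))    # sqrt(2): height 2 (up to rounding)
print(relative_height_quadratic(2, -2, 1))    # (1 +- i)/2: height 2 (up to rounding)
```

In the last example both conjugates have modulus less than 1, so the whole height comes from the leading coefficient $a_0 = 2$.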
The main merit of the height function introduced in this section is the following finiteness theorem, whose first part is due to Northcott and the second to Kronecker.


2.1.14. Theorem. (Northcott, Kronecker) Let $d \ge 1$ and $X > 0$. Then the set $S(n,d,X) = \{P \in \mathbf{P}^n(\overline{\mathbf{Q}}) \mid [\mathbf{Q}(P) : \mathbf{Q}] \le d,\ H(P) \le X\}$ is finite. Furthermore, we have $H(P) > 1$, except if the point $P$ has projective coordinates all equal to zero or a root of unity.


Proof. Let $P = (x_0, \dots, x_n) \in \mathbf{P}^n(\overline{\mathbf{Q}})$. Up to permuting the coordinates, we can assume that $x_0 \neq 0$. Then we can write $P = (1, \alpha_1, \dots, \alpha_n)$, where the $\alpha_i$ are algebraic. It is trivially true that $H(\alpha_i) \le H(P)$ and $[\mathbf{Q}(\alpha_i) : \mathbf{Q}] \le [\mathbf{Q}(P) : \mathbf{Q}]$. It therefore suffices to prove that the set of algebraic numbers $\{\alpha \in \overline{\mathbf{Q}} \mid [\mathbf{Q}(\alpha) : \mathbf{Q}] \le d,\ H(\alpha) \le X\}$ is finite. A bound on the degree and the height gives, by Lemma 5-2.1.13, a bound on the coefficients of the minimal polynomial of $\alpha$, which proves the finiteness. For the second assertion, we can again only consider points $P = (1, \alpha_1, \dots, \alpha_n)$ where the $\alpha_i$ are algebraic. If $H(P) = 1$, then $|\alpha_i|_v \le 1$ (for all $i$ and all $v$), hence this stays true for $\alpha_i^m$. Thus the set of points $(1, \alpha_1^m, \dots, \alpha_n^m)$ is finite, which implies that every $\alpha_i$ is zero or a root of unity.


2.1.15. Lemma. If $\alpha$ and $\beta$ are two algebraic numbers, then

$$\tfrac{1}{2} H(\alpha)H(\beta) \le H(1, \alpha + \beta, \alpha\beta) \le 2H(\alpha)H(\beta). \tag{5.18}$$


Proof. We can reason according to the cases $|\alpha|_v \le |\beta|_v \le 1$, $|\alpha|_v \le 1 \le |\beta|_v$ and $1 \le |\alpha|_v \le |\beta|_v$. Whenever $v$ is an ultrametric absolute value, the following equality can be checked directly:

$$\max(1, |\alpha + \beta|_v, |\alpha\beta|_v) = \max(1, |\alpha|_v)\max(1, |\beta|_v).$$


For an Archimedean absolute value satisfying the triangle inequality, we obtain the bounds

$$\tfrac{1}{2}\max(1, |\alpha|_v)\max(1, |\beta|_v) \le \max(1, |\alpha + \beta|_v, |\alpha\beta|_v) \le 2\max(1, |\alpha|_v)\max(1, |\beta|_v).$$


The proof of the lemma follows from taking the product of these inequalities.


2.1.16. Theorem. Let $P_0, \dots, P_m$ be a family of homogeneous polynomials of degree $d$ in $x = (x_0, \dots, x_n)$. Let $Z$ be the locus of the common zeros of the $P_i$ and $\Phi : \mathbf{P}^n \setminus Z \to \mathbf{P}^m$ the map defined by $\Phi(x) = (P_0(x), \dots, P_m(x))$.

i) There exists a constant $C_1 = C_1(\Phi)$ such that for $x \in (\mathbf{P}^n \setminus Z)(\overline{\mathbf{Q}})$ we have

$$H(\Phi(x)) \le C_1 H(x)^d. \tag{5.19}$$

ii) Let $V$ be a closed subvariety of $\mathbf{P}^n$ such that $V \cap Z = \emptyset$. Then there exist two constants $C_1 = C_1(\Phi)$ and $C_2 = C_2(\Phi)$ such that for $x \in V(\overline{\mathbf{Q}})$,

$$C_2 H(x)^d \le H(\Phi(x)) \le C_1 H(x)^d. \tag{5.20}$$
Proof. The first inequality can be deduced by repeatedly applying the
triangle inequality (usual and ultrametric). Let $x = (x_0,\dots,x_n)$, write
$x^j := x_0^{j_0}\cdots x_n^{j_n}$, and let $K$ be a field of rationality of $x$. We write
$P_i = \sum_j a_j^{(i)} x^j$ and denote by $N = \binom{n+d}{d}$ the number of monomials of degree $d$. Finally,
let $N_v$ be a constant such that for all $x_1,\dots,x_N \in K$ we have:

$|x_1 + \cdots + x_N|_v \leq N_v \max(|x_1|_v,\dots,|x_N|_v).$

Observe that we can take $N_v = 1$ for the ultrametric places and $N_v = N$
for the Archimedean places. We can therefore write, for every place $v$ of
$K$,

$|P_i(x)|_v \leq N_v \max_j |a_j^{(i)}|_v \, \max_i |x_i|_v^d.$

By setting $A_v = \max_{i,j} |a_j^{(i)}|_v$, we see that $A_v = 1$, except for a finite
number of places. This gives us

$H_K(\Phi(x)) = \prod_v \max_i |P_i(x)|_v \leq \prod_v N_v A_v \max_i |x_i|_v^d = \Big(\prod_v N_v A_v\Big) H_K(x)^d$

and, by taking the $[K:\mathbb{Q}]$th roots, we obtain the first inequality with
$C_1 = (\prod_v N_v A_v)^{1/[K:\mathbb{Q}]}$. For the second inequality, we rely on Hilbert's
Nullstellensatz (see Theorem B-2.1), which says, in light of the given hypotheses,
that if $Q_1 = \cdots = Q_r = 0$ is a system of equations of $V$, there
exist polynomials $A_i^{(j)}$ and $B_i^{(j)}$ and an integer $M \geq 1$ such that

$X_j^M = \sum_{i=0}^{m} A_i^{(j)} P_i + \sum_{i=1}^{r} B_i^{(j)} Q_i.$

Observe also that we can assume that the $A_i^{(j)}$ are homogeneous of degree
$M - d$ and with coefficients in $K$. If we apply this to a point $x \in V$, we
obtain

$x_j^M = \sum_{i=0}^{m} A_i^{(j)}(x) P_i(x).$

By applying the triangle inequality as before, we obtain

$|x_j|_v^M \leq (m+1)_v \max_i |A_i^{(j)}(x)|_v \max_i |P_i(x)|_v \leq A'_v \max_i |x_i|_v^{M-d} \max_i |P_i(x)|_v,$

with $A'_v = 1$, except for a finite number of places. This gives us

$\max_j |x_j|_v^d \leq A'_v \max_i |P_i(x)|_v.$

By taking the product over the places $v$ and the $[K:\mathbb{Q}]$th root, we obtain
the desired result.


Notation. We set $h_K = \log H_K$ and $h = \log H$, and we call $h$ the logarithmic
height. With this convention, the inequalities in
part ii) of the previous theorem can be rewritten as

$h(\Phi(x)) = d\,h(x) + O(1).$


We will now return to the study of elliptic curves and define a Weil height. 


2.1.17. Definition. Let $E \subset \mathbb{P}^2$ be an elliptic curve given by a Weierstrass
equation $Y^2Z = X^3 + aXZ^2 + bZ^3$. For $P \in E(\bar{\mathbb{Q}})$, we define the
height of $P$ by

$h(P) = h(x(P))$ if $P \neq 0_E$,
$h(P) = 0$ if $P = 0_E$.


2.1.18. Theorem. There exists a constant $C_1$ (dependent on $E$) such that
the height over $E$ satisfies

$-C_1 \leq h([2](P)) - 4h(P) \leq C_1.$ (5.21)


Proof. We can ignore the case where $P = 0_E$ or $P$ is 2-torsion. By invoking
the duplication formula (5.9), we see that if we set

$\Phi(T,X) = (4T(X^3 + aXT^2 + bT^3),\ X^4 - 2aX^2T^2 - 8bXT^3 + a^2T^4),$

then $\Phi(1,x(P)) = (1,x(2P))$. On the other hand, the polynomials $x^3 + ax + b$
and $x^4 - 2ax^2 - 8bx + a^2$ are relatively prime under the condition that
$\Delta_0 = 4a^3 + 27b^2$ is non-zero. To see why this is true, a direct computation
or an application of the Euclidean algorithm yields the identity

$(3x^2 + 4a)(x^4 - 2ax^2 - 8bx + a^2) - (3x^3 - 5ax - 27b)(x^3 + ax + b) = 4a^3 + 27b^2.$ (5.22)

By applying Theorem 5-2.1.16 to $\Phi : \mathbb{P}^1 \to \mathbb{P}^1$ with $d = 4$, we have

$h([2](P)) = h(1,x(2P)) = h(\Phi(1,x(P))) = 4h(1,x(P)) + O(1) = 4h(P) + O(1).$
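Identity (5.22) can be confirmed mechanically. The sketch below evaluates both sides with exact integers on a grid of values of $x$, $a$ and $b$. Since both sides are polynomials of degree at most 6 in $x$, 3 in $a$ and 2 in $b$, agreement on a grid with 11 values per variable forces the identity. The helper names are illustrative only.

```python
from itertools import product

def lhs(x, a, b):
    """Left-hand side of identity (5.22), evaluated with exact integers."""
    quartic = x**4 - 2*a*x**2 - 8*b*x + a**2
    cubic = x**3 + a*x + b
    return (3*x**2 + 4*a) * quartic - (3*x**3 - 5*a*x - 27*b) * cubic

def rhs(a, b):
    """Right-hand side of (5.22): the quantity 4a^3 + 27b^2."""
    return 4*a**3 + 27*b**2

# 11 sample values per variable exceed the degree in each variable.
grid = range(-5, 6)
identity_holds = all(lhs(x, a, b) == rhs(a, b) for x, a, b in product(grid, repeat=3))
print(identity_holds)
```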


1. We are talking about a logarithmic height; furthermore, for reasons which are
unimportant in this context, this height is equal to 2 times the height commonly used.

2.1.19. Theorem. The height over $E$ is symmetric (i.e., $h(-P) = h(P)$)
and almost satisfies the parallelogram law, in other words:

$h(P+Q) + h(P-Q) = 2h(P) + 2h(Q) + O(1).$ (5.23)


Proof. The formula is trivially true when $P$ or $Q$ is zero; it is also true if
$Q = \pm P$ by the previous theorem. We can therefore assume that
$P, Q \in E \setminus \{0_E\}$ and $Q \neq \pm P$. Let $x_1 = x(P)$, $x_2 = x(Q)$, $x_3 = x(P+Q)$ and
$x_4 = x(P-Q)$, and also $x_1 + x_2 = u$, $x_1x_2 = v$. The formulas (5.7) and
(5.8) can be rewritten as

$x_3 + x_4 = \frac{2u(a+v) + 4b}{u^2 - 4v}, \qquad x_3x_4 = \frac{(v-a)^2 - 4bu}{u^2 - 4v}.$

Thus if we introduce the map from $\mathbb{P}^2$ to $\mathbb{P}^2$ given by

$\Phi(T,U,V) := (U^2 - 4TV,\ 2U(aT+V) + 4bT^2,\ (aT - V)^2 - 4bTU),$

then the three polynomials do not have any common zeros in $\mathbb{P}^2$ (the
verification of this below uses the condition that $4a^3 + 27b^2 \neq 0$). By the
second part of Theorem 5-2.1.16, we thus obtain

$h(\Phi(T,U,V)) = 2h(T,U,V) + O(1).$

Furthermore, if we let $\psi : (E \setminus \{0_E\})^2 \to \mathbb{P}^2$ be defined by
$\psi(P,Q) = (1, x(P)+x(Q), x(P)x(Q))$ and $\mu(P,Q) = (P+Q, P-Q)$, we see that

$h(\psi(P,Q)) = h(x(P)) + h(x(Q)) + O(1)$

by Lemma 5-2.1.15 and also, by using formulas (5.7) and (5.8), that

$\psi \circ \mu = \Phi \circ \psi.$

This implies that

$h(P+Q) + h(P-Q) = h(x(P+Q)) + h(x(P-Q))$
$= h(1, x(P+Q) + x(P-Q), x(P+Q)x(P-Q)) + O(1)$
$= h(\psi \circ \mu(P,Q)) + O(1)$
$= h(\Phi(\psi(P,Q))) + O(1)$
$= 2h(\psi(P,Q)) + O(1)$
$= 2h(P) + 2h(Q) + O(1).$

To complete the proof, we will check that if $\Phi(T,U,V) = (0,0,0)$, then
$T = U = V = 0$. This is immediate if $T = 0$. If $T \neq 0$, we set $x = U/2T$ and
$w = V/T$; we thus obtain $x^2 - w = 0$, $4x(a+w) + 4b = 0$ and $(w-a)^2 - 8bx = 0$.
By eliminating $w$, we find $x^4 - 2ax^2 - 8bx + a^2 = x^3 + ax + b = 0$, which
is impossible according to the identity (5.22).


2.2. Néron-Tate Heights 


If $C$ is a curve embedded in the projective space $\mathbb{P}^n$, we can define the
height of a point on $C$ as the height of the point in the projective space
that contains it. The drawback of this definition is its nonintrinsic
character. We will now offer a modification of this height which removes
this drawback.


2.2.1. Lemma. Let $S$ be a set and $d > 1$. If $h : S \to \mathbb{R}$ and $f : S \to S$
satisfy $|h \circ f - dh| \leq C$, then for all $x \in S$, the sequence $(d^{-n}h(f^n(x)))$ is
convergent; we denote its limit by $\hat{h}_f(x)$. Furthermore,
for every $x \in S$,

$|h(x) - \hat{h}_f(x)| \leq \frac{C}{d-1},$ (5.24)

$\hat{h}_f(f(x)) = d\,\hat{h}_f(x).$ (5.25)

Proof. By writing the inequality in the statement of the lemma at the point
$f^{n-1}(x)$ and dividing by $d^n$, we obtain

$-\frac{C}{d^n} \leq d^{-n}h(f^n(x)) - d^{-n+1}h(f^{n-1}(x)) \leq \frac{C}{d^n}.$

By summing these inequalities from $n+1$ to $m$ (with $n < m$), we can
conclude that

$-\frac{C}{d^n(d-1)} \leq d^{-m}h(f^m(x)) - d^{-n}h(f^n(x)) \leq \frac{C}{d^n(d-1)}.$

Thus $d^{-n}h(f^n(x))$ is the general term in a Cauchy sequence, whose limit we
denote by $\hat{h}_f(x)$. By letting $m$ tend to infinity, we thus obtain

$-\frac{C}{d^n(d-1)} \leq \hat{h}_f(x) - d^{-n}h(f^n(x)) \leq \frac{C}{d^n(d-1)}.$

In particular,

$-\frac{C}{d-1} \leq \hat{h}_f(x) - h(x) \leq \frac{C}{d-1}.$

Finally,

$\hat{h}_f(f(x)) = \lim_{n\to\infty} d^{-n}h(f^{n+1}(x)) = d \lim_{n\to\infty} d^{-n-1}h(f^{n+1}(x)) = d\,\hat{h}_f(x).$


By applying this lemma to the Weil height of an elliptic curve and to the
morphism $[2] : E \to E$ (with $d = 4$), we obtain the following theorem.

2.2.2. Theorem. (Néron-Tate) Let $E$ be an elliptic curve defined over a
number field $K$. We define a height, called the “canonical” or “Néron-Tate”
height, by the formula

$\hat{h}(P) = \lim_{n\to\infty} \frac{h([2^n](P))}{4^n}.$ (5.26)

This height over $E$ satisfies $\hat{h}(P) = h(P) + O(1)$ and also the parallelogram
law:

$\hat{h}(P+Q) + \hat{h}(P-Q) = 2\hat{h}(P) + 2\hat{h}(Q).$ (5.27)

It is therefore quadratic. In particular, $\hat{h}(mP) = m^2\hat{h}(P)$. Finally, $\hat{h}(P) = 0$
if and only if $P$ is a torsion point.


Proof. We can apply Lemma 5-2.2.1 to the height $h(P) = h(x(P))$ and
to the map $P \mapsto [2](P)$ with $d = 4$. The inequality in Theorem 5-2.1.19
applied to the points $P' = [2^n](P)$ and $Q' = [2^n](Q)$ gives

$-C \leq h([2^n](P+Q)) + h([2^n](P-Q)) - 2h([2^n](P)) - 2h([2^n](Q)) \leq C.$

By dividing by $4^n$ and letting $n$ tend to infinity, we obtain the desired
formula. Thus $\hat{h}$ is quadratic and in particular satisfies $\hat{h}(mP) = m^2\hat{h}(P)$.
If $mP = 0$ with $m \neq 0$, we can immediately deduce that $\hat{h}(P) = 0$. Conversely, if
$\hat{h}(P) = 0$, then for all $m \in \mathbb{Z}$, we have $\hat{h}(mP) = 0$. Therefore, the set
$\{mP \mid m \in \mathbb{Z}\}$ is of bounded height and is hence finite, which implies that
$P$ is torsion.
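The limit (5.26) can be observed numerically. The following sketch uses the duplication formula for the $x$-coordinate, as in the map $\Phi$ of Theorem 5-2.1.18, on a hypothetical example that does not come from the text: the curve $y^2 = x^3 - x + 1$ over $\mathbb{Q}$ and the point $P = (1,1)$. It computes $h(2^nP)/4^n$ with $h(x) = \log\max(|p|,|q|)$ for $x = p/q$, following the convention of Definition 5-2.1.17.

```python
from fractions import Fraction
from math import log

A, B = -1, 1  # hypothetical example curve y^2 = x^3 - x + 1 (not from the text)

def double_x(x):
    """x-coordinate of 2P computed from x = x(P) by the duplication formula."""
    num = x**4 - 2*A*x**2 - 8*B*x + A**2
    den = 4 * (x**3 + A*x + B)
    return num / den

def naive_height(x):
    """h(x) = log max(|p|, |q|) for x = p/q in lowest terms."""
    return log(max(abs(x.numerator), x.denominator))

def canonical_height_approximations(x, steps):
    """Return the list of h(2^n P) / 4^n for n = 0, ..., steps."""
    approximations = []
    for n in range(steps + 1):
        approximations.append(naive_height(x) / 4**n)
        x = double_x(x)
    return approximations

approximations = canonical_height_approximations(Fraction(1), 7)
for n, value in enumerate(approximations):
    print(n, value)
```

Successive values differ by at most $C/4^{n+1}$, as in the proof of Lemma 5-2.2.1, so a handful of doublings already stabilizes several digits. The numerators and denominators grow roughly like $e^{4^n \hat{h}(P)}$, which is why exact rational arithmetic is used.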


2.2.3. Corollary. If an elliptic curve $E$ is defined over a number field $K$,
the torsion subgroup $E(K)_{\mathrm{tor}}$ is finite and the group $E(K)/E(K)_{\mathrm{tor}}$ is free
abelian.


By skipping ahead to a theorem which we will prove in the following section
(the group $E(K)$ is finitely generated), we can try to specify the size of the
generators of $E(K)$ in the following manner. Theorem 5-2.2.2 can be interpreted
as saying that the quadratic form $\hat{h}$ on the lattice $E(K)/E(K)_{\mathrm{tor}}$
is nondegenerate. We can be even more precise and prove the following
theorem.


2.2.4. Theorem. The real quadratic form $E(K) \otimes \mathbb{R} \to \mathbb{R}$ induced by $\hat{h}$
is positive-definite.


We should point out that the fact that a quadratic form $Q(x)$ satisfies
$Q(x) > 0$ for all $x \in \mathbb{Q}^n \setminus \{0\}$ does not imply that it is positive-definite
(consider $Q(x_1,x_2) = (x_1 + x_2\sqrt{2})^2$).

Proof. Let $Q$ be the quadratic form on $\mathbb{R}^r$ obtained from $\hat{h}$ by tensoring with
$\mathbb{R}$. It is clearly positive. If it were degenerate, it could then be written,
after a base change, as $Q(x_1,\dots,x_r) = x_1^2 + \cdots + x_s^2$ where $s < r$. The
sets $\{x \in \mathbb{R}^r ;\ Q(x) \leq \varepsilon\}$ would be, for every $\varepsilon > 0$, cylinders with infinite
volume and would therefore contain, by Minkowski's theorem (3-5.3), a
non-zero point in every lattice. This would contradict the fact that the
set $\{P \in E(K) \mid \hat{h}(P) \leq \varepsilon\}$ is reduced, for small enough $\varepsilon$, to the torsion
subgroup.


There is also a scalar product on $E(K) \otimes \mathbb{R}$ defined by

$\langle P, Q \rangle = \tfrac{1}{2}\big(\hat{h}(P+Q) - \hat{h}(P) - \hat{h}(Q)\big).$


2.2.5. Definition. Let $P_1,\dots,P_r$ be a basis for $E(K)$ modulo the finite
torsion subgroup $F$. We define the regulator of $E$ by

$\mathrm{Reg}(E/K) := \det\big(\langle P_i, P_j \rangle\big)_{1 \leq i,j \leq r},$

and we define the minimal height of a point of infinite order by

$\hat{h}_{\min}(E/K) = \min_{P \in E(K) \setminus F} \hat{h}(P).$


These two quantities are exactly the quantities needed to bound
the height of possible generators of the Mordell-Weil group $E(K)$, in light
of the following result coming from the geometry of numbers and due to
Hermite (see Exercise 3-6.6).


2.2.6. Proposition. There exist constants $C_r$ such that for every lattice
$L$ in $\mathbb{R}^r$, equipped with the Euclidean norm, there exists a basis for $L$,
$e_1,\dots,e_r$, such that

$\det(L) \leq |e_1| \cdots |e_r| \leq C_r \det(L).$


3. The Mordell-Weil Theorem 


The goal of this section is to prove the following theorem. 


3.1. Theorem. (Mordell-Weil) Let $E$ be an elliptic curve defined over
a number field $K$ (for example $K = \mathbb{Q}$). Then the group $E(K)$ is finitely
generated.


We could of course reinterpret this theorem by saying that all of the rational
points on the curve can be found starting with a finite set of points and
applying the chord-tangent method to these. An important intermediate
step is the following result.


3.2. Theorem. (“Weak” Mordell-Weil) Let $E$ be an elliptic curve defined
over a number field $K$ (for example $K = \mathbb{Q}$). Then the group $E(K)/2E(K)$
is finite.


Actually, the “weak” Mordell-Weil theorem, together with the theory of 
heights from the previous section implies Theorem 5-3.1, thanks to the 
following descent lemma. 


3.3. Lemma. Let $G$ be an abelian group endowed with a quadratic form
$q : G \to \mathbb{R}$. Suppose that the sets $\{x \in G \mid q(x) \leq X\}$ are finite for all
$X \in \mathbb{R}$ and that the quotient $G/2G$ is finite. Then the group $G$ is finitely
generated. More precisely, if $S$ is a set of representatives modulo $2G$ and
if $C := \max_{x \in S} q(x)$, then $\{x \in G \mid q(x) \leq C\}$ generates $G$.


Proof. Let us first point out that the hypotheses imply that $q(x) \geq 0$: if
there existed $x_0$ where $q(x_0) < 0$, we would have, by homogeneity, infinitely
many elements where $q(x) < 0$. Let $|x| := \sqrt{q(x)}$, so that $|mx| = m|x|$ and
$|x+y| \leq |x| + |y|$ (for $x, y \in G$ and $m \in \mathbb{N}$). Let $x \in G$ where $q(x) > C$.
We can define a sequence $(x_n)$ of points in $G$ as follows: start with $x_0 = x$,
then write $x_0 = y_1 + 2x_1$, where $y_1 \in S$ and $x_1 \in G$, then $x_1 = y_2 + 2x_2$,
etc. Observe that

$|x_1| = \frac{|x_0 - y_1|}{2} \leq \frac{|x_0| + |y_1|}{2} \leq \frac{|x_0| + \sqrt{C}}{2} < |x_0|.$

We can iterate this procedure and obtain a sequence which satisfies

$|x_n| < |x_{n-1}| < \cdots < |x_1| < |x_0|,$

as long as $|x_{n-1}| > \sqrt{C}$. The finiteness hypothesis implies that, after a finite
number of steps, we will have $|x_n| \leq \sqrt{C}$. The point $x = x_0$ can be expressed
as a linear combination of the $y_i$ and $x_n$, which are all in the finite set
$\{y \in G \mid q(y) \leq C\}$, so this set indeed generates $G$.
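The descent in this proof is an explicit algorithm. The sketch below runs it in a toy setting chosen only for illustration: $G = \mathbb{Z}^2$ with $q(x,y) = x^2 + y^2$, representatives $S = \{(0,0),(1,0),(0,1),(1,1)\}$ of $G/2G$, hence $C = 2$. None of these choices comes from the text. In the application, $G$ is $E(K)$ and $q$ is the Néron-Tate height.

```python
def q(point):
    """Quadratic form on G = Z^2 (an illustrative choice): q(x, y) = x^2 + y^2."""
    x, y = point
    return x * x + y * y

# Representatives of G/2G and the constant C = max of q over them.
REPRESENTATIVES = [(0, 0), (1, 0), (0, 1), (1, 1)]
C = max(q(s) for s in REPRESENTATIVES)

def descend(point):
    """Write point = y_1 + 2*y_2 + ... + 2^(n-1)*y_n + 2^n*x_n with q(x_n) <= C.

    Returns the list [y_1, ..., y_n] and the final point x_n.
    """
    steps = []
    while q(point) > C:
        x, y = point
        rep = (x % 2, y % 2)  # the representative of the class of point modulo 2G
        steps.append(rep)
        point = ((x - rep[0]) // 2, (y - rep[1]) // 2)
    return steps, point

def reconstruct(steps, last):
    """Recombine the output of descend into the original point."""
    n = len(steps)
    x = sum(rep[0] * 2**k for k, rep in enumerate(steps)) + last[0] * 2**n
    y = sum(rep[1] * 2**k for k, rep in enumerate(steps)) + last[1] * 2**n
    return (x, y)

steps, last = descend((-5, 7))
print(steps, last, reconstruct(steps, last))
```

Every $y_i$ and the final $x_n$ lie in the finite set $\{q \leq C\}$, which is exactly the generating set produced by the lemma.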


We will now lay out a plan for the proof of Theorem 5-3.2. To make things
simpler, we will assume that the equation of the curve is given by:

$y^2 = f(x) = x^3 + ax + b = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3),$

in other words, we will assume that the roots of $f$ are rational over $K$. In
particular, the 2-torsion points, $P_i = (\alpha_i, 0)$, are in $E(K)$. This does not
interfere with the generality of the proof of the Mordell-Weil theorem since
we can always replace $K$ by the extension $K(\alpha_1,\alpha_2,\alpha_3)$. However, from
an algorithmic point of view, it is better to work in $K$. At the end of the
section, we will indicate how the proof should be modified to this effect.


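3.4. Definition. We define the map $\psi = (\psi_1,\psi_2,\psi_3)$ from $E(K)$ to
$(K^*/K^{*2})^3$ by the following formulas:

$\psi_i(P) = x(P) - \alpha_i$ if $P \neq P_i, 0_E$,
$\psi_i(P) = (\alpha_i - \alpha_j)(\alpha_i - \alpha_k)$ if $P = P_i$,
$\psi_i(P) = 1$ if $P = 0_E$,

where $\{i,j,k\} = \{1,2,3\}$ and all values are taken modulo $K^{*2}$.


3.5. Remark. In the definition of the homomorphism $\psi$, the formula
for $P = P_i = (\alpha_i,0)$ is natural since $(x - \alpha_i)K^{*2} = (x - \alpha_j)(x - \alpha_k)K^{*2}$.
Another possible definition would be to take $\psi_i(P_i) = f'(\alpha_i) \bmod K^{*2}$.


By proving the following three lemmas, we will have finished the proof
of Theorem 5-3.2 since we can deduce from them that
$E(K)/2E(K) \cong \psi(E(K))$.
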
3.6. Lemma. The map $\psi : E(K) \to (K^*/K^{*2})^3$ is a homomorphism.


3.7. Lemma. The kernel of the map $\psi$ is equal to $2E(K)$.


3.8. Lemma. The image $\psi(E(K))$ in $(K^*/K^{*2})^3$ is finite.


Proof. (of Lemma 5-3.6) If $P$, $Q$ and $R$ are three points on the curve $E$, the
equality $P + Q + R = 0_E$ is equivalent to saying that $P$, $Q$ and $R$ are collinear.
Therefore, let $y = \lambda x + \mu$ be the equation of the line $D$ which intersects
$E$ at $P$, $Q$ and $R$. First assume that $\{P,Q,R\} \cap \{0_E, P_1, P_2, P_3\} = \emptyset$. The
equation $f(x) - (\lambda x + \mu)^2 = 0$ therefore has $x(P)$, $x(Q)$ and $x(R)$ as roots.
If we set $x' = x - \alpha_i$, then

$f(x' + \alpha_i) - (\lambda x' + \lambda\alpha_i + \mu)^2 = 0$

has $x(P) - \alpha_i$, $x(Q) - \alpha_i$ and $x(R) - \alpha_i$ as solutions, and since $f(\alpha_i) = 0$,
the constant term is $-(\lambda\alpha_i + \mu)^2$. This yields

$(x(P) - \alpha_i)(x(Q) - \alpha_i)(x(R) - \alpha_i) = (\lambda\alpha_i + \mu)^2,$

and thus $\psi_i(P)\psi_i(Q)\psi_i(R) = 1$. The equality $R = P + Q$ implies that
$P$, $Q$ and $-R$ are collinear, in other words $\psi_i(P)\psi_i(Q)\psi_i(-R) = 1$; since
$\psi_i(-R) = \psi_i(R) = \psi_i(R)^{-1}$, we indeed obtain $\psi_i(R) = \psi_i(P)\psi_i(Q)$. This
finishes the proof if $\{P,Q,R\} \cap \{0_E, P_1, P_2, P_3\} = \emptyset$. If $R = 0_E$, the relation
becomes obvious. If not, observe that $(x(P) - \alpha_1)(x(P) - \alpha_2)(x(P) - \alpha_3) = y(P)^2$;
we can check case by case that the relation $\psi_i(P+Q) = \psi_i(P)\psi_i(Q)$
is always true.


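Proof. (of Lemma 5-3.7) It is clear that $2E(K) \subset \mathrm{Ker}\,\psi$, since the exponent
of $K^*/K^{*2}$ is 2. We need to show that $\bigcap_i \mathrm{Ker}\,\psi_i \subset 2E(K)$. Assume then
that

for $i = 1, 2, 3$, there exists $z_i \in K^*$ such that $x(P) - \alpha_i = z_i^2$. (5.28)

We will solve the Vandermonde linear system of equations $u + v\alpha_i + w\alpha_i^2 = z_i$.
From the equations $(u + v\alpha_i + w\alpha_i^2)^2 = x - \alpha_i$, where $x = x(P)$, and the
relation $\alpha_i^3 = -a\alpha_i - b$, we obtain the system

$u^2 - 2vwb - x = 0,$
$2uv - 2vwa - bw^2 + 1 = 0,$ (5.29)
$v^2 + 2uw - aw^2 = 0,$

which in particular yields $v^3 + vw^2a + bw^3 - w = 0$ and also (by noticing that
$w$ must be non-zero because if not, then $v = 0$ and $1 = 0$!) the following
equation:

$\Big(\frac{1}{w}\Big)^2 = \Big(\frac{v}{w}\Big)^3 + a\Big(\frac{v}{w}\Big) + b.$

Therefore, $Q := \big(\frac{v}{w}, \frac{1}{w}\big) \in E(K)$. A direct computation using the
duplication formula (5.9) and the relations (5.29) therefore gives us $P = 2Q$.
This is because

$x(2Q) = \frac{(v/w)^4 - 2a(v/w)^2 - 8b(v/w) + a^2}{4\big((v/w)^3 + a(v/w) + b\big)}$
$= \frac{v^4 - 2av^2w^2 - 8bvw^3 + a^2w^4}{4w^2}$
$= \frac{(aw^2 - 2uw)^2}{4w^2} + \frac{1}{4w^2}\big(-2av^2w^2 - 8bvw^3 + a^2w^4\big)$
$= u^2 - 2vwb - \frac{a}{2}\big(v^2 - aw^2 + 2uw\big)$
$= x.$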


Proof. (of Lemma 5-3.8) Choose a finite set $S$ of places of the field $K$ such
that

i) the element $2\Delta_0 = 2(4a^3 + 27b^2)$ is an $S$-unit,

ii) the ring $\mathcal{O}_{K,S}$ is principal.

This is possible because of Remark 3-5.14. Condition i) implies that
$\alpha_i - \alpha_j \in \mathcal{O}_{K,S}^*$ since $\Delta = ((\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3))^2$. We can now
write $x = A/B$ and $y = C/D$ where $A, B, C, D \in \mathcal{O}_{K,S}$ and $\gcd(A,B) =
\gcd(C,D) = 1$ (in the ring $\mathcal{O}_{K,S}$). The equation $y^2 = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$
can be transformed into $C^2B^3 = D^2(A - \alpha_1B)(A - \alpha_2B)(A - \alpha_3B)$.
Since $D$ is relatively prime to $C$, we know that $D^2$ divides $B^3$, and since
$B$ is relatively prime to $A$, we know that $B^3$ divides $D^2$. Hence up to
modifying $B$ and $D$ by a unit, we can assume that $B^3 = D^2$, $B = E^2$ and
$D = E^3$, which yields

$(x,y) = \Big(\frac{A}{E^2}, \frac{C}{E^3}\Big)$ and $C^2 = (A - \alpha_1E^2)(A - \alpha_2E^2)(A - \alpha_3E^2).$

If $p$ (a prime in $\mathcal{O}_{K,S}$) divides $(A - \alpha_1E^2)$ and $(A - \alpha_2E^2)$, then it also
divides $(\alpha_1 - \alpha_2)E^2$ and $(\alpha_1 - \alpha_2)A$, hence $(\alpha_1 - \alpha_2)$, which is invertible.
The factors are relatively prime and are therefore squares, up to a unit.
Thus we obtain

$x(P) - \alpha_i = \frac{A - \alpha_iE^2}{E^2} = \varepsilon_i z_i^2,$

where $\varepsilon_i \in \mathcal{O}_{K,S}^*$. As a corollary to the generalized unit theorem, we have
that $\mathcal{O}_{K,S}^*/\mathcal{O}_{K,S}^{*2}$ is finite, and we can therefore choose the $\varepsilon_i$ from a finite
set. Thus, $\psi(P) = (\varepsilon_1,\varepsilon_2,\varepsilon_3)$ takes a finite number of possible values in
$(K^*/K^{*2})^3$.


3.9. Remark. To make the proof of the Mordell-Weil theorem effective
computationally, it suffices to find representatives of $E(K)/2E(K)$. The
proof indicates that it thus suffices to decide, for $(\varepsilon_1,\varepsilon_2,\varepsilon_3) \in (\mathcal{O}_{K,S}^*/\mathcal{O}_{K,S}^{*2})^3$,
if the curve defined by the equations $A - \alpha_iE^2 = \varepsilon_iZ_i^2$ has a rational point
and then compute it. Unfortunately, no such algorithm is currently known.


We will finish this section by briefly indicating the modifications necessary
for working with a curve $y^2 = f(x)$ without leaving the field $K$ of coefficients
of the polynomial $f$. We introduce the ring $A := K[X]/(f(X))$.
By letting $\alpha$ be the image of $X$, we set $\psi(P) = x(P) - \alpha$ with values in
$G := A^*/A^{*2}$ if $x(P)$ is not a root of $f(X)$. For the particular case of
2-torsion points, we proceed as in Definition 5-3.4.


4. Siegel’s Theorem 


We are now interested in integer solutions. The main result is the following. 


4.1. Theorem. (Siegel) Let $C$ be an affine curve given by the equation

$y^2 = f(x) = x^3 + ax + b,$

where $a, b \in \mathcal{O}_K$ and $\Delta := 4a^3 + 27b^2 \neq 0$. Then the set of points $P = (x,y)$
on the curve, where $x, y \in \mathcal{O}_K$, is finite.

4.2. Remark. The smoothness condition is necessary because, for example,
the curve $y^2 = x^3$ has every pair $(t^2, t^3)$ as a solution (where $t \in \mathcal{O}_K$),
while the curve $y^2 = x^3 - x^2$ has every pair $(t^2 + 1, t^3 + t)$ as a solution
(where $t \in \mathcal{O}_K$). We could deduce from the previous theorem (but will not
prove it) an apparently more general theorem, namely the following.


4.3. Theorem. (Siegel) Let $C$ be an affine curve given by the equation

$ax^3 + bx^2y + cxy^2 + dy^3 + ex^2 + fxy + gy^2 + hx + iy + j = 0$

such that the corresponding projective curve is smooth. Then the set of
points $P = (x,y)$ on the curve, with $x, y \in \mathcal{O}_{K,S}$, is finite.


We will now deduce Siegel’s theorem (Theorem 5-4.1) from the following 
result, also due to Siegel. 


4.4. Theorem. (The S-unit equation) Let $K$ be a number field and $S$
a finite set of places. The set of pairs of $S$-units $(x,y) \in (\mathcal{O}_{K,S}^*)^2$ which
satisfy

$x + y = 1$ (5.30)

is finite.
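To get a feeling for the statement, one can search for solutions in the simplest case $K = \mathbb{Q}$, where the $S$-units for $S = \{2, 3\}$ (together with the Archimedean place) are the rationals $\pm 2^a 3^b$. The sketch below is a brute-force search in a bounded box of exponents. The bound 4 is an arbitrary choice and the function names are illustrative, so the output is only the list of solutions inside that box, with no claim of completeness.

```python
from fractions import Fraction

def s_units(primes, bound):
    """All rationals +-prod(p^e) with p in primes and |e| <= bound."""
    units = {Fraction(1)}
    for p in primes:
        units = {u * Fraction(p) ** e for u in units for e in range(-bound, bound + 1)}
    return units | {-u for u in units}

def unit_equation_solutions(primes, bound):
    """Pairs (x, y) of S-units inside the search box with x + y = 1."""
    units = s_units(primes, bound)
    return sorted((x, 1 - x) for x in units if (1 - x) in units)

solutions = unit_equation_solutions([2, 3], 4)
for x, y in solutions:
    print(x, y)
```

Typical solutions found this way are $9 + (-8) = 1$ and $\tfrac14 + \tfrac34 = 1$. The theorem asserts that enlarging the box eventually produces no new pairs, which a finite search cannot prove by itself.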


Proof. (Theorem 5-4.4 implies Theorem 5-4.1) We can, if we want to,
expand the set $S$ and the field $K$. We will therefore assume that $\mathcal{O}_{K,S}$
is principal, $\Delta \in \mathcal{O}_{K,S}^*$ and $f(x) = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$. Then let
$(x,y) \in (\mathcal{O}_{K,S})^2$ be an integer solution. As in the proof of the Mordell-Weil
theorem, we deduce from this the factorization:

$x - \alpha_i = b_iz_i^2,$

where the $b_i$ are representatives of $\mathcal{O}_{K,S}^*/\mathcal{O}_{K,S}^{*2}$. We will introduce the algebraic
numbers $\beta_i = \sqrt{b_i}$, which are in a finite extension $K'$ of $K$. From $x - \alpha_i =
(\beta_iz_i)^2$, we can deduce the relations

$\alpha_j - \alpha_i = (\beta_iz_i - \beta_jz_j)(\beta_iz_i + \beta_jz_j) \in \mathcal{O}_{K,S}^*.$

We will now make use of the “Siegel identities”:

$\frac{\beta_iz_i \pm \beta_jz_j}{\beta_iz_i - \beta_kz_k} \mp \frac{\beta_jz_j \pm \beta_kz_k}{\beta_iz_i - \beta_kz_k} = 1.$ (5.31)

We know from Theorem 5-4.4 that the set of values taken by
$\frac{\beta_iz_i \pm \beta_jz_j}{\beta_iz_i - \beta_kz_k}$ is finite. It easily follows that there are only a finite
number of values $\beta_iz_i$ and likewise of values for $x$ and hence for $y$.

4.5. Remark. Observe that the reduction of Theorem 5-4.1 to Theorem
5-4.4 is computationally effective in the following sense: if we had an algorithm
for calculating solutions of the equation of $S$-units, we would have
an algorithm for calculating the set of integer solutions to $y^2 = f(x)$.


We essentially know two proofs of Theorem 5-4.4: one due to Siegel, based
on a rational approximation theorem, and one due to Baker, based on his
theorem on linear forms in logarithms. The disadvantage of Siegel's proof
is that it does not explicitly determine the finite set of solutions. This is
nevertheless the one that we will present. For a sketch of Baker's proof,
see Chap. 6, Sect. 6-4.


Reduction of Theorem 5-4.1 to Theorem 5-4.4. Let $m \geq 2$. We know
from the generalized unit theorem that the group $\mathcal{O}_{K,S}^*/\mathcal{O}_{K,S}^{*m}$ is finite. In
other words, there exists a finite set of $S$-units $\varepsilon_i$ such that all $S$-units
can be written as $x = \varepsilon_iz^m$ with $z \in \mathcal{O}_{K,S}^*$. Solutions $(x,y)$ of the $S$-unit
equation thus provide solutions to one of a finite number of equations of the
following form:

$\varepsilon_1z_1^m + \varepsilon_2z_2^m = 1,$ (5.32)

and it suffices to prove that the latter only have a finite number of solutions
$(z_1,z_2) \in (\mathcal{O}_{K,S}^*)^2$ or likewise in $(\mathcal{O}_{K,S})^2$.


4.6. Proposition. Let $a, b \in \mathcal{O}_{K,S}$ and $m \geq 3$. The set of $S$-integral
solutions of the equation

$ax^m + by^m = 1$

is finite.

Proof. We will give the proof for the case which has the simplest notation,
the case $\mathcal{O}_{K,S} = \mathbb{Z}$. Let $\alpha = \sqrt[m]{-b/a}$, so that $[\mathbb{Q}(\alpha):\mathbb{Q}] \leq m$. We
factor $aX^m + b = (X - \alpha)F(X)$ and thus obtain

$a\Big(\frac{x}{y}\Big)^m + b = \Big(\frac{x}{y} - \alpha\Big)F\Big(\frac{x}{y}\Big) = \frac{1}{y^m}.$

Observe that the ratio $x/y$ must be close to one of the roots, for example
to $\alpha$. Since it must lie at a distance which is bounded below from the other
roots (those of $F$), we get an inequality of the form:

$\Big|\frac{x}{y} - \alpha\Big| \leq \frac{C_1}{|y|^m},$ (5.33)

where the constant $C_1$ only depends on $\alpha$. To finish the proof, it suffices
to have a Diophantine approximation statement of the type

$\frac{C_2}{|y|^\delta} \leq \Big|\frac{x}{y} - \alpha\Big|$ for all $\frac{x}{y} \in \mathbb{Q}$, (5.34)

where $C_2$ is dependent on $\alpha$ and $\delta$, and of course $\delta < m$. In fact, by
combining inequalities (5.33) and (5.34), we obtain

$|y|^{m-\delta} \leq \frac{C_1}{C_2},$

which bounds $|y|$. A statement of the same type as inequality (5.34) is provided by Roth's
theorem, which allows us to choose any $\delta > 2$ and hence a $\delta < 3 \leq m$. An
older result of Thue allows us to choose every $\delta > 1 + \frac{m}{2}$ and
finishes the proof (observe that if $m > 2$, then $1 + \frac{m}{2} < m$). The proof of
Thue's theorem is given in the following chapter. Let us nonetheless point
out that the proof of Roth's theorem, like that of Thue's theorem, is not
constructive in the sense that it does not allow us to calculate the constant
$C_2 = C_2(\alpha,\delta)$. A more constructive method was developed in the 1960s
by Baker and will be briefly discussed in the next chapter.


5. Elliptic Curves over the Complex Numbers 


In this section, we will describe the connection to the classical theory of
elliptic functions, thus justifying the name “elliptic curves”. Elliptic functions
take their name from the fact that they arise in the calculation of the
length of an arc of an ellipse.


We will need the following classical result on complex variables.


5.1. Theorem. (Liouville) If a function is entire (i.e., holomorphic on 
all of C) and bounded, then it is constant. 


We will now consider $\Omega := \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$, a lattice in $\mathbb{C}$, and study the
$\Omega$-periodic functions, i.e., such that $f(z + \omega) = f(z)$ for $\omega \in \Omega$. Liouville's
theorem indicates right away that the only entire functions which are
$\Omega$-periodic are constant functions, a fact which justifies the following definition.


5.2. Definition. An elliptic function is a meromorphic function on $\mathbb{C}$
which is $\Omega$-periodic for some lattice $\Omega$.

Let us point out that the set of $\Omega$-elliptic functions forms a field, which
is denoted by $\mathcal{M}(\Omega)$. This field contains the constants, i.e., the field $\mathbb{C}$,
and is stable under derivation. We can see right away that this field is not
reduced to only constants.


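5.3. Definition. Let $\Omega := \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$ be a lattice in $\mathbb{C}$. We define the
Weierstrass function associated to $\Omega$ by the formula

$\wp(z) = \frac{1}{z^2} + {\sum_{\omega \in \Omega}}' \Big(\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2}\Big),$ (5.35)

where $\sum'$ signifies that we leave out $\omega = 0$ in the sum.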


The Weierstrass function allows us to give a complete description of elliptic 
functions and to establish a connection to elliptic curves. 


5.4. Theorem. The Weierstrass function $\wp$ is an elliptic function. The
field of $\Omega$-elliptic functions is generated by $\wp$ and its derivative $\wp'$, i.e.,
$\mathcal{M}(\Omega) = \mathbb{C}(\wp,\wp')$. Furthermore, these two elliptic functions satisfy the
following algebraic relation:

$\wp'(z)^2 = 4\wp(z)^3 - g_2\wp(z) - g_3,$ (5.36)

where the constants $g_2$ and $g_3$ are defined by

$g_2 = g_2(\Omega) = 60{\sum_{\omega \in \Omega}}' \frac{1}{\omega^4}$ and $g_3 = g_3(\Omega) = 140{\sum_{\omega \in \Omega}}' \frac{1}{\omega^6}.$

Finally, we have $g_2^3 - 27g_3^2 \neq 0$.
Proof. (Sketch) The defining series of the derivative,

$\wp'(z) = -2\sum_{\omega \in \Omega} \frac{1}{(z - \omega)^3},$

is absolutely convergent and uniformly convergent on every compact set
which avoids $\Omega$: it therefore defines a holomorphic function on $\mathbb{C} \setminus \Omega$,
which is clearly $\Omega$-periodic and odd. Furthermore, $\wp'$ has a pole of order
3 at every point of $\Omega$, and thus $\wp' \in \mathcal{M}(\Omega)$. The defining series of $\wp$
shows that it is even and meromorphic with a double pole at every $\omega \in \Omega$.
The periodicity of $\wp'$ implies that $\wp(z + \omega) = \wp(z) + C_\omega$. Let $\omega$ be one
of the generators of $\Omega$ such that $\omega/2 \notin \Omega$. By taking $z := -\omega/2$, we
obtain $\wp(-\omega/2) = \wp(\omega/2) = \wp(-\omega/2) + C_\omega$, hence $C_\omega = 0$. Thus we
also know that $\wp \in \mathcal{M}(\Omega)$. In order to prove that $\mathcal{M}(\Omega) = \mathbb{C}(\wp,\wp')$,
we can decompose a function in $\mathcal{M}(\Omega)$ into the sum of an even and an odd function
(an odd elliptic function being $\wp'$ times an even one)
and have thus reduced to showing that a function $f$ which is elliptic
and even is in $\mathbb{C}(\wp)$. To do this, we prove that its poles and zeros are
symmetric under the map $z \mapsto -z$ and of even order at the periods and
half-periods. Thus the function $f$ has the same zeros and poles as a function of
the type $\prod_i (\wp(z) - \wp(u_i))^{m_i}$: the two functions coincide up to a constant.
In order to prove the relation of algebraic dependence, we compute the
Taylor expansion of $\wp(z)$ (or rather of $\wp(z) - z^{-2}$) at $z = 0$:

$\wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} a_nz^n$, where $a_n = (n+1){\sum_{\omega \in \Omega}}' \frac{1}{\omega^{n+2}}.$ (5.37)

The calculation of the Taylor expansion (only the polar part and the constant
term) of the function $\psi(z) = \wp'(z)^2 - 4\wp(z)^3 + g_2\wp(z) + g_3$ shows that
it is holomorphic and zero at 0. Liouville's theorem thus implies that the
function $\psi(z)$ is identically zero.


Finally, the equality $g_2^3 - 27g_3^2 = 0$ is equivalent to the fact that
$4x^3 - g_2x - g_3 = 0$ has a double root, say $h$. If that were the case, then we would
have an equation of the form

$\Big(\frac{\wp'(z)}{2(\wp(z) - h)}\Big)^2 = \wp(z) + 2h,$

which is a contradiction considering the zeros.


5.5. Corollary. Let $\Omega$ be a lattice in $\mathbb{C}$. The map $z \mapsto (\wp(z), \wp'(z), 1)$
extended by $\omega \mapsto (0,1,0)$ defines a biholomorphic map from $\mathbb{C}/\Omega$ to the
projective cubic with points $(X,Y,T)$ (in $\mathbb{P}^2$) given by the equation

$TY^2 = 4X^3 - g_2XT^2 - g_3T^3.$

Furthermore, the map is an isomorphism of groups.


Proof. The first assertion follows essentially from the previous theorem.
The second assertion can be proven by comparing the algebraic addition
law to the following addition formula on the Weierstrass function:

$\wp(u+v) = -\wp(u) - \wp(v) + \frac{1}{4}\Big(\frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)}\Big)^2.$ (5.38)

For a fixed $v$, the poles of the left-hand side are double poles at every
$u \in -v + \Omega$. The right-hand side actually has the same poles because
$\frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)}$ has a simple pole for $u \in \Omega$, which is compensated by the
term $-\wp(u)$, and since $\wp(u) - \wp(v) = 0$ if and only if $u \pm v \in \Omega$, but
$\wp'(u) - \wp'(v)$ vanishes for $u \in v + \Omega$. Once we have checked the equality of
terms corresponding to the poles, formula (5.38) follows.


Conversely, we can show, given $g_2, g_3 \in \mathbb{C}$ which satisfy $\Delta := g_2^3 - 27g_3^2 \neq 0$,
that there exists a lattice $\Omega$ such that $g_2 = g_2(\Omega)$ and $g_3 = g_3(\Omega)$. Thus,
over the field of complex numbers, we can consider an elliptic curve as
a complex torus, i.e., a quotient $\mathbb{C}/\Omega$. This point of view clearly shows
various properties of elliptic curves or of families of elliptic curves. The
following two propositions illustrate this principle.


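5.6. Proposition. Let $E = \mathbb{C}/\Omega$ be an elliptic curve. Then

$\mathrm{Ker}[N]_E = \tfrac{1}{N}\Omega/\Omega \cong (\mathbb{Z}/N\mathbb{Z})^2.$ (5.39)

Proof. The map $[N]_E : \mathbb{C}/\Omega \to \mathbb{C}/\Omega$ is induced by
multiplication by $N$ in $\mathbb{C}$, hence $\mathrm{Ker}[N]_E = \{z \in \mathbb{C} \mid Nz \in \Omega\}/\Omega$. Since
$\Omega \cong \mathbb{Z}^2$, the proposition is clearly true.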


5.7. Remark. We can observe that the torsion points allow us to partially
reconstruct the lattice $\Omega$. To be more precise, we easily see that for every
prime number $\ell$, we have (see footnote 2)

$\varprojlim_n \mathrm{Ker}[\ell^n] = \varprojlim_n \Omega/\ell^n\Omega \cong \Big(\varprojlim_n \mathbb{Z}/\ell^n\mathbb{Z}\Big)^2.$

We thus introduce the ring $\mathbb{Z}_\ell := \varprojlim_n \mathbb{Z}/\ell^n\mathbb{Z}$, and we see that

$\varprojlim_n \mathrm{Ker}[\ell^n] = \Omega \otimes \mathbb{Z}_\ell.$

This remark might appear to be pedantic when we are working with elliptic
curves over $\mathbb{C}$, but it becomes fundamental when we want to work over other
fields (for example a finite field) since the left-hand side is still meaningful,
whereas the right-hand side (say $\Omega$) does not exist anymore. The following
definition will clarify things.


5.8. Definition. Let $E$ be an elliptic curve defined over a field $K$ and $\ell$ a
prime number other than the characteristic of $K$. The $\ell$-adic Tate module
of an elliptic curve is defined to be

$T_\ell(E) := \varprojlim_n E[\ell^n].$

(Here, $E[\ell^n]$ is the subgroup of points with coordinates in the algebraic
closure, $\bar{K}$, which are killed by $\ell^n$.)


5.9. Remark. We point out that if $u \in \mathbb{C}^*$, then $\mathbb{C}/\Omega$ is isomorphic to
$\mathbb{C}/u\Omega$ (by multiplication by $u$). Moreover, we can easily check that

$u^2\wp(uz, u\Omega) = \wp(z,\Omega)$ and $u^3\wp'(uz, u\Omega) = \wp'(z,\Omega).$

Thus we can always, up to isomorphism, replace the lattice $\mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$ by
the similar lattice $\mathbb{Z} \oplus \mathbb{Z}\tau$, where we set $\tau := \omega_2/\omega_1$. Up to exchanging $\omega_1$
and $\omega_2$, we can also assume that $\mathrm{Im}(\tau) > 0$. Every elliptic curve (over
$\mathbb{C}$) is thus isomorphic to a torus $\mathbb{C}/(\mathbb{Z} \oplus \mathbb{Z}\tau)$, where $\tau$ is in the Poincaré
half-plane,

$\mathcal{H} := \{\tau \in \mathbb{C} \mid \mathrm{Im}(\tau) > 0\}.$

2. Recall that if $(\varphi_n : G_{n+1} \to G_n)$ is a sequence of homomorphisms, the projective
limit, $\varprojlim_n G_n$, is defined to be the set of sequences $(x_n)_{n \geq 1}$, where $x_n \in G_n$, which
satisfy $\varphi_n(x_{n+1}) = x_n$. It is obviously a group.


The following result specifies when two such curves are isomorphic. 


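5.10. Proposition. Two tori $E_\tau = \mathbb{C}/(\mathbb{Z} \oplus \mathbb{Z}\tau)$ and $E_{\tau'} = \mathbb{C}/(\mathbb{Z} \oplus \mathbb{Z}\tau')$
are isomorphic if and only if there exists $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z})$ such that

$\tau' = \frac{a\tau + b}{c\tau + d}.$

In particular, we can identify the space of isomorphism classes of complex
elliptic curves with the space $\mathrm{SL}(2,\mathbb{Z}) \backslash \mathcal{H}$.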


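Proof. A homomorphism $\varphi : \mathbb{C}/(\mathbb{Z} \oplus \mathbb{Z}\tau') \to \mathbb{C}/(\mathbb{Z} \oplus \mathbb{Z}\tau)$ comes from a
homomorphism from $\mathbb{C}$ to $\mathbb{C}$, i.e., from multiplication by $\alpha \in \mathbb{C}$ such that
$\alpha(\mathbb{Z} \oplus \mathbb{Z}\tau') \subset \mathbb{Z} \oplus \mathbb{Z}\tau$. In particular, $\alpha = c\tau + d$ (where $c, d \in \mathbb{Z}$) and
$\alpha\tau' = a\tau + b$ (where $a, b \in \mathbb{Z}$). Therefore, $\tau' = (a\tau + b)/(c\tau + d)$. The fact
that $\varphi$ is an isomorphism translates into $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}(2,\mathbb{Z})$, and since

$\mathrm{Im}(\tau') = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} \frac{\mathrm{Im}(\tau)}{|c\tau + d|^2},$

we see that the matrix is indeed in $\mathrm{SL}(2,\mathbb{Z})$.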
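The formula for $\mathrm{Im}(\tau')$ used at the end of this proof is easy to test in floating point. The sketch below compares both sides for a few integer matrices, including one of determinant $-1$, which sends the upper half-plane to the lower one. The sample point and the matrices are arbitrary choices made for illustration.

```python
def moebius(matrix, tau):
    """Apply the matrix ((a, b), (c, d)) to tau: (a*tau + b) / (c*tau + d)."""
    (a, b), (c, d) = matrix
    return (a * tau + b) / (c * tau + d)

def imaginary_part_formula(matrix, tau):
    """Right-hand side det * Im(tau) / |c*tau + d|^2 of the formula in the proof."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return det * tau.imag / abs(c * tau + d) ** 2

tau = complex(0.3, 1.7)  # an arbitrary point of the upper half-plane
examples = [((1, 1), (0, 1)), ((0, -1), (1, 0)), ((2, 1), (1, 1)), ((0, 1), (1, 0))]
gaps = [abs(moebius(m, tau).imag - imaginary_part_formula(m, tau)) for m in examples]
print(gaps)
```

The last matrix has determinant $-1$: it lies in $\mathrm{GL}(2,\mathbb{Z})$ but moves $\tau$ out of the upper half-plane, which is why only $\mathrm{SL}(2,\mathbb{Z})$ appears in the proposition.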


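5.11. Proposition. Let $E = \mathbb{C}/\Omega$ be an elliptic curve, where $\Omega = \mathbb{Z} + \mathbb{Z}\tau$.
Then

$\mathrm{End}(E) = \{\alpha \in \mathbb{C} \mid \alpha\Omega \subset \Omega\} = \begin{cases} \mathbb{Z} & \text{if } [\mathbb{Q}(\tau):\mathbb{Q}] > 2, \\ \mathbb{Z} + \mathbb{Z}A\tau & \text{if } [\mathbb{Q}(\tau):\mathbb{Q}] = 2, \end{cases}$ (5.40)

where, in the second case, the integer $A$ is the leading coefficient of the
minimal equation $A\tau^2 + B\tau + C = 0$.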


In the case where $\mathrm{End}(E)$ is a subring of finite index of the ring of integers
of an imaginary quadratic field, we say that $E$ has “complex multiplication”,
or (in algebra) is of “CM-type”.

Proof. In light of the previous discussion, an endomorphism is given by
multiplication by $\alpha = c\tau + d$ and corresponds to a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with
integer coefficients such that $\tau = (a\tau + b)/(c\tau + d)$, in other words
$c\tau^2 + (d - a)\tau - b = 0$. If $[\mathbb{Q}(\tau):\mathbb{Q}] > 2$, the only possibility is to have $c = b = 0$ and
$a = d$, in other words, multiplication by $d$; if $\tau$ is quadratic and satisfies
the minimal equation $A\tau^2 + B\tau + C = 0$, we can conclude that $c = mA$,
$d - a = mB$ and $-b = mC$, hence $\alpha = mA\tau + d \in \mathbb{Z} + \mathbb{Z}A\tau$.


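5.12. Remark. If we consider $\operatorname{End}(E)$ as a subring of $\mathbb{C}$, we can easily
verify the following formula:

$$\deg(\alpha) := \operatorname{card} \operatorname{Ker}(\alpha) = N(\alpha) = \alpha\bar{\alpha}. \tag{5.41}$$

In particular, the map $\deg : \operatorname{End}(E) \to \mathbb{Z}$ is quadratic. Furthermore, the
ring $\operatorname{End}(E)$ acts naturally on the Tate module

$$T_\ell(E) := \varprojlim_n E[\ell^n].$$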


6. Elliptic Curves over a Finite Field 


In this section, we will translate some of the results from the previous section
into results on fields of characteristic $p$, notably finite fields. See Silverman’s
book [70] for the complete proofs. Elliptic curves over finite fields
are especially useful in cryptography (see for example the text [165]).


Let $E$ be a projective plane curve, as in (5.2), given by the equation

$$Y^2Z + a_1XYZ + a_3YZ^2 = X^3 + a_2X^2Z + a_4XZ^2 + a_6Z^3,$$

where $a_i \in \mathbb{F}_q$. The group of rational points $E(\mathbb{F}_q)$ is obviously finite, and
in particular, all of the points are torsion. Any such curve also has the
maps given by “multiplication by an integer $n$”, but it has a remarkable
endomorphism, specific to the characteristic $p$, as well.


6.1. Definition. The “Frobenius” endomorphism on $E/\mathbb{F}_q$ is defined by
the formula

$$\Phi_q(x, y) = (x^q, y^q).$$

Let us point out that if $f(x, y) = 0$ (where $f(X, Y) \in \mathbb{F}_q[X, Y]$), then
$f(x^q, y^q) = (f(x, y))^q = 0$, and hence $\Phi_q$ is indeed an endomorphism (it is
clear that it will also respect the addition law).

We will admit the following proposition (see for example [70]), which is
analogous to Proposition 5-5.6.

6.2. Proposition. Let $E$ be an elliptic curve defined over a finite field $\mathbb{F}_q$
and $N$ an integer $\ge 2$. We denote by $\operatorname{Ker}[N]_E = \{P \in E(\bar{\mathbb{F}}_q) \mid NP = 0_E\}$.
If $N$ is not divisible by the characteristic of the field, then

$$\operatorname{Ker}[N]_E \cong (\mathbb{Z}/N\mathbb{Z})^2. \tag{5.42}$$


6.3. Remarks. In particular, for a prime number $\ell$ different from the
characteristic of $\mathbb{F}_q$, we can define the Tate module as we did over the field
of complex numbers:

$$T_\ell(E) = \varprojlim_n \operatorname{Ker}[\ell^n]_E.$$

We again have that $T_\ell(E) \cong (\mathbb{Z}_\ell)^2$ (as a $\mathbb{Z}_\ell$-module). Note however that
we do not have a natural lattice $\Omega \cong \mathbb{Z}^2$ such that $T_\ell(E) = \Omega \otimes \mathbb{Z}_\ell$.


The statement of the proposition does not stay true for $N = p^m$, where $p$ is
the characteristic of $\mathbb{F}_q$. It can be shown that in this case either
$\operatorname{Ker}[p^m]_E \cong \mathbb{Z}/p^m\mathbb{Z}$ (the “ordinary” case) or $\operatorname{Ker}[p^m]_E = \{0_E\}$ (the “supersingular”
case).


The key result concerning the number of rational points over F, is the 
following. 


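6.4. Theorem. (Hasse) Let $E$ be an elliptic curve defined over $\mathbb{F}_q$. Then,

$$|\operatorname{card} E(\mathbb{F}_q) - q - 1| \le 2\sqrt{q}. \tag{5.43}$$

More precisely, there exists an imaginary quadratic integer $\alpha$ which satisfies
$\alpha\bar{\alpha} = q$ such that

$$\operatorname{card} E(\mathbb{F}_{q^m}) = q^m + 1 - \alpha^m - \bar{\alpha}^m. \tag{5.44}$$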


Proof. (Partial) The set of rational points is also the set of fixed points of
the Frobenius endomorphism $\Phi_q$. We will assume that the degree of the
endomorphism is given, as with complex numbers, by a quadratic function
and thus, in particular, that

$$\deg(n\Phi_q + m) = P(n, m) = an^2 + 2bmn + cm^2.$$

Then we have, on the one hand, $\operatorname{card} E(\mathbb{F}_q) = P(1, -1) = a + c - 2b$ and,
on the other hand, $c = P(0, 1) = 1$ and $a = P(1, 0) = q$. Finally, since
the polynomial $P(n, m)$ is positive-valued, we have that $b^2 - ac \le 0$ and
therefore $|b| \le \sqrt{q}$, which proves the inequality given in the statement.


For the formula which comes next, we use an analogy to the complex case,
where the endomorphisms satisfy a quadratic relation. The Frobenius $\Phi_q$
can also be seen as an endomorphism of Tate modules, whose eigenvalues
$\alpha, \bar{\alpha}$ satisfy $\alpha\bar{\alpha} = q$. The eigenvalues of $\Phi_q^m$ are therefore $\alpha^m$ and $\bar{\alpha}^m$.
Since $\operatorname{card} E(\mathbb{F}_{q^m}) = \deg(\Phi_q^m - 1)$, the relation that we want to prove can
be written as $\Phi_q^2 - (\alpha + \bar{\alpha})\Phi_q + q = 0$.
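
Hasse's inequality (5.43) is easy to test by brute force on small examples. The sketch below assumes a short Weierstrass model $y^2 = x^3 + Ax + B$ over a prime field $\mathbb{F}_p$ with $p \ge 5$; the function names are illustrative and the naive point count is only practical for small $p$. The last function uses (5.44) with $m = 2$, together with $\alpha^2 + \bar{\alpha}^2 = (\alpha + \bar{\alpha})^2 - 2p$, to predict the number of points over $\mathbb{F}_{p^2}$ from the count over $\mathbb{F}_p$.

```python
def count_points(A, B, p):
    """Number of points of y^2 = x^3 + A*x + B over F_p, point at infinity included."""
    # square_roots[v] = number of y in F_p with y^2 = v
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    total = 1  # the point at infinity
    for x in range(p):
        total += square_roots[(x * x * x + A * x + B) % p]
    return total


def is_smooth(A, B, p):
    """True when 4A^3 + 27B^2 is nonzero modulo p (assumes p >= 5)."""
    return (4 * A**3 + 27 * B**2) % p != 0


def hasse_holds(A, B, p):
    """Check |card E(F_p) - p - 1| <= 2*sqrt(p) using exact integer arithmetic."""
    n = count_points(A, B, p)
    return (n - p - 1) ** 2 <= 4 * p


def predicted_count_over_quadratic_extension(A, B, p):
    """card E(F_{p^2}) predicted by (5.44) with m = 2."""
    t = p + 1 - count_points(A, B, p)  # t = alpha + conjugate(alpha)
    return p * p + 1 - (t * t - 2 * p)


print(count_points(1, 1, 5))
print(all(hasse_holds(A, B, 13)
          for A in range(13) for B in range(13) if is_smooth(A, B, 13)))
```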


7. The L-function of an Elliptic Curve 


This section, which does not contain any proofs, is a stroll in the direction
of the work of Wiles [80] (elliptic curves, modular forms and the great
Fermat’s last theorem) and the famous Birch & Swinnerton-Dyer conjecture.
The stroll continues at the end of the following chapter. Two good
references to continue along this path are [27] and [37].


Let $E$ be an elliptic curve over $\mathbb{Q}$. Suppose, as above, that it is a projective
plane curve given by the equation

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6,$$

this time with $a_i \in \mathbb{Z}$ (cf. formula (5.2)). We can see fairly easily that the
only coordinate changes that preserve the form of the generalized Weierstrass
equation are of the type:

$$x = u^2x' + r, \qquad y = u^3y' + u^2sx' + t.$$

We can define the discriminant of the model in the following ad hoc manner.
If the model can be written $y^2 = x^3 + Ax + B$, we set
$\Delta := -16\Delta(A, B) = -16(4A^3 + 27B^2)$; in general, we can always transform the equation into a
simpler model $(A, B)$, and we will set $\Delta' = u^{-12}\Delta$. A Weierstrass equation
is said to be minimal and $\Delta_E$ is its discriminant if the discriminant of every
other equation with integer coefficients is of the form $\Delta = u^{12}\Delta_E$ where
$u \in \mathbb{Z}$.


To simplify things, we will continue the discussion by staying in the field
of rationals. Reduction modulo $p$ has at most one singular point of cusp
type, $y^2 = x^3$, or node type, $y^2 + axy + bx^2 = x^3$, where the polynomial
$y^2 + axy + bx^2 = (y - \alpha x)(y - \alpha' x)$ is either irreducible (if $\alpha \in \mathbb{F}_{p^2} \setminus \mathbb{F}_p$)
or not (if $\alpha \in \mathbb{F}_p$) over $\mathbb{F}_p$. Geometrically, in the second case, the tangent
cone at the singular point is composed of two lines $y = \alpha x$ and $y = \alpha' x$,
which could be either rational over $\mathbb{F}_p$ or defined over $\mathbb{F}_{p^2}$ and conjugate.


7.1. Definition. Let $p$ be a prime number and $E/\mathbb{Q}$ have a minimal
model

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6.$$

The curve $E$ is said to have good reduction at $p$ if this model stays smooth
modulo $p$ (i.e., if $p$ does not divide $\Delta_E$). A curve $E$ is said to have additive
reduction at $p$ if this model is singular modulo $p$ and if the singularity has a
unique tangent. The curve is said to have multiplicative reduction or semi-stable
reduction at $p$ if this model is singular modulo $p$ with two distinct
tangents; if the two tangents are defined over $\mathbb{F}_p$ (resp. are not defined
over $\mathbb{F}_p$), we say that the reduction is split (resp. non-split) multiplicative.
A fairly simple-to-calculate criterion is the following. If we write a minimal
Weierstrass model, except maybe for 2 and 3, in the following way:
$y^2 = x^3 - 27c_4x - 54c_6$, then we have additive reduction if $p$ divides $c_4$ and $\Delta_E$,
and we have multiplicative reduction if $p$ divides $\Delta_E$ but does not divide
$c_4$. The model is minimal (except maybe for 2 and 3) under the condition
that for every prime number $p$, $p^4 \nmid c_4$ or $p^6 \nmid c_6$. Finally, we define the
invariant $j$ of the elliptic curve $E$ by

$$j = c_4^3/\Delta. \tag{5.45}$$


7.2. Remark. The adjectives “additive” and “multiplicative” come from
the following observation, whose verification is left to the reader. If a Weierstrass
cubic $E$ is singular (necessarily at a unique point $P_0$), then the chord-tangent
method defines a group law on $E \setminus \{P_0\}$. This group is isomorphic
to the additive group if the singular point is a cusp point and isomorphic
to the multiplicative group if the tangents are distinct. To be more precise,
$E(K) \setminus \{P_0\}$ is isomorphic to $(K, +)$ if the reduction is additive, to $(K^*, \times)$
if the reduction is split multiplicative, and finally to $(K'_1, \times)$ if the reduction
is non-split multiplicative, where $K'_1 = \operatorname{Ker}\{N_{K'/K} : K'^* \to K^*\}$ and
$K'$ is a quadratic extension. Observe that $j = j_E$ is indeed independent
of the model, since, by a change in coordinates, we have $c_4 = u^4c_4'$ and
$\Delta = u^{12}\Delta'$. Finally, we can easily verify the following formula:

$$c_4^3 - c_6^2 = 1728\Delta. \tag{5.46}$$


Now we can summarize the practical calculation on the generalized Weierstrass
model (5.2) by the following list of formulas (due to Tate):

$$\begin{aligned}
b_2 &= a_1^2 + 4a_2 \\
b_4 &= 2a_4 + a_1a_3 \\
b_6 &= a_3^2 + 4a_6 \\
b_8 &= a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2 \\
c_4 &= b_2^2 - 24b_4 \\
c_6 &= -b_2^3 + 36b_2b_4 - 216b_6 \\
\Delta &= -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6.
\end{aligned}$$
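
Tate's formulas translate directly into a few lines of code. The sketch below is illustrative only: the function names are not from the text, and `reduction_type` applies the $c_4$ criterion of Definition 7.1 under the assumptions that the model passed in is minimal and that $p \ge 5$ (neither assumption is checked). The example curve $y^2 + y = x^3 - x^2$ is a standard one with discriminant $-11$.

```python
from fractions import Fraction


def tate_invariants(a1, a2, a3, a4, a6):
    """Return (b2, b4, b6, b8, c4, c6, disc) for a generalized Weierstrass model."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return b2, b4, b6, b8, c4, c6, disc


def j_invariant(a1, a2, a3, a4, a6):
    """j = c4^3 / disc as an exact fraction (the model must be nonsingular)."""
    *_, c4, _c6, disc = tate_invariants(a1, a2, a3, a4, a6)
    return Fraction(c4**3, disc)


def reduction_type(coeffs, p):
    """Criterion of Definition 7.1; assumes a minimal model and p >= 5."""
    *_, c4, _c6, disc = tate_invariants(*coeffs)
    if disc % p != 0:
        return "good"
    return "additive" if c4 % p == 0 else "multiplicative"


curve = (0, -1, 1, 0, 0)          # y^2 + y = x^3 - x^2
*_, c4, c6, disc = tate_invariants(*curve)
print(disc, c4**3 - c6**2 == 1728 * disc, j_invariant(*curve))
```

For a short model $y^2 = x^3 + Ax + B$ the same function returns $-16(4A^3 + 27B^2)$, in agreement with the ad hoc definition of the discriminant given earlier.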


7.3. Definition. The conductor of $E/\mathbb{Q}$ is defined by $N_E = \prod_p p^{n(E,p)}$,
where

$$n(E, p) = \begin{cases} 0 & \text{if } E \text{ has good reduction at } p, \\ 1 & \text{if } E \text{ has multiplicative reduction at } p, \\ 2 + \delta_{E,p} & \text{if } E \text{ has additive reduction at } p, \end{cases}$$

where $\delta_{E,p} = 0$ if $p \ge 5$ and $\delta_{E,2} \le 8$, $\delta_{E,3} \le 5$. (The precise values of $\delta_{E,2}$
and $\delta_{E,3}$ are described in Appendix C.)
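
7.4. Definition. Let $E$ be an elliptic curve defined over $\mathbb{Q}$. If $E$ has good
reduction at $p$, we set

$$a_p = p + 1 - \operatorname{card} E(\mathbb{F}_p).$$

We then define the function $L(E, s)$ and its local factors by:

$$L_p(E, s) = \begin{cases} (1 - a_pp^{-s} + p^{1-2s})^{-1} & \text{if } p \text{ does not divide } \Delta_E, \\ (1 - p^{-s})^{-1} & \text{if } E \text{ has split multiplicative reduction,} \\ (1 + p^{-s})^{-1} & \text{if } E \text{ has non-split multiplicative reduction,} \\ 1 & \text{if } E \text{ has additive reduction,} \end{cases} \tag{5.47}$$

$$L(E, s) = \sum_{n=1}^{\infty} a_nn^{-s} = \prod_p L_p(E, s). \tag{5.48}$$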
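
To make the definition concrete, here is a rough sketch that computes $a_p$ by naive point counting and then the Dirichlet coefficients $a_n$ by expanding the local factors at primes of good reduction. The function names are illustrative, the counting is quadratic in $p$ and only meant for small primes, and `dirichlet_coefficient` assumes (without checking) that every prime factor of $n$ is a prime of good reduction.

```python
def count_points(coeffs, p):
    """Points of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 over F_p, plus infinity."""
    a1, a2, a3, a4, a6 = coeffs
    total = 1  # the point at infinity
    for x in range(p):
        rhs = x**3 + a2 * x * x + a4 * x + a6
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                total += 1
    return total


def a_p(coeffs, p):
    return p + 1 - count_points(coeffs, p)


def a_prime_power(coeffs, p, k):
    """Coefficient of X^k in (1 - a_p X + p X^2)^(-1), i.e. a_{p^k} at a good prime."""
    ap = a_p(coeffs, p)
    prev, cur = 0, 1  # conventions: a_{p^(-1)} = 0, a_{p^0} = 1
    for _ in range(k):
        # recursion a_{p^(k+1)} = a_p * a_{p^k} - p * a_{p^(k-1)}
        prev, cur = cur, ap * cur - p * prev
    return cur


def dirichlet_coefficient(coeffs, n):
    """a_n, assuming all prime factors of n are primes of good reduction."""
    result, p = 1, 2
    while n > 1:
        k = 0
        while n % p == 0:
            n //= p
            k += 1
        if k:
            result *= a_prime_power(coeffs, p, k)
        p += 1
    return result


curve = (0, -1, 1, 0, 0)  # y^2 + y = x^3 - x^2, bad reduction only at 11
print([dirichlet_coefficient(curve, n) for n in range(1, 11)])
```

The multiplicativity $a_{mn} = a_ma_n$ for coprime $m, n$ is exactly what the Euler product (5.48) encodes.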


7.5. Proposition. The Dirichlet series and the Euler product defining 
the function L(E,s) are absolutely convergent for Re(s) > 3/2. 


Proof. This immediately follows from Hasse’s theorem (Theorem 5-6.4). 


7.6. Theorem. (Wiles [80]) The function $L(E, s)$ can be analytically continued
to an entire function which satisfies the following functional equation,
where we let $\Lambda(E, s) := N_E^{s/2}(2\pi)^{-s}\Gamma(s)L(E, s)$,

$$\Lambda(E, s) = \pm\Lambda(E, 2 - s). \tag{5.49}$$


Observe the obvious analogy to the functional equation of the Riemann zeta function.
The theorem of Wiles is actually more precise and “explains” the
functional equation of the Dirichlet series $L(E, s) = \sum_{n=1}^{\infty} a_nn^{-s}$ by the
fact that the associated function (for $z$ in the Poincaré half-plane),
$f_E(z) = \sum_{n=1}^{\infty} a_n\exp(2\pi inz)$, is modular with level $N_E$ and weight 2 and satisfies
the functional equation

$$f_E\left(-\frac{1}{N_Ez}\right) = \pm N_Ez^2f_E(z). \tag{5.50}$$

A formal computation, analogous to the one done to prove the functional
equation of the $\zeta(s)$ function, shows that (5.50) for $f_E$ implies the functional
equation for $L(E, s)$ (see Chap. 6, the last section for more details).


The connection between Wiles’s theorem and Fermat’s last “theorem” is
the following. Starting with a hypothetical solution to Fermat’s equation

$$a^\ell + b^\ell + c^\ell = 0,$$

we construct an elliptic curve, called the Frey or Hellegouarch curve:

$$y^2 = x(x - a^\ell)(x + b^\ell).$$

By examining this hypothetical curve, we notice that it has (or would
have) some remarkable properties. For example, $\Delta_E = 2^{-8}(abc)^{2\ell}$ and
$N_E = \prod_{p \mid abc} p$; this implies that the associated Galois representation (see
Appendix C) has very little ramification. The associated modular form
(from Wiles’s theorem) would have even more extraordinary properties:
by appealing to a theorem due to Ribet [59], we could lower its level $N_E$
down to 2. However, there does not exist such a non-zero form with level
2, which finishes the proof of Fermat’s last theorem! We will add a couple
more elements to this subject in Chap. 6. You could also consult the texts
of Hellegouarch [37] and Diamond and Shurman [27].


To state the following conjecture, we remind you of the existence of a
bilinear form $\langle \cdot, \cdot \rangle$ coming from the Néron-Tate height. We will also need
to define the real period of $E$:

$$\Omega_E = \int_{E(\mathbb{R})} \frac{dx}{2y + a_1x + a_3}. \tag{5.51}$$


7.7. Conjecture. (Birch & Swinnerton-Dyer [14])


I) The order of vanishing of the function $L(E, s)$ at $s = 1$ is equal to the
rank $r = \operatorname{rank} E(\mathbb{Q})$ of the Mordell-Weil group.

II) Let $P_1, \ldots, P_r$ be a basis for $E(\mathbb{Q})$ modulo torsion and $\Omega_E$ the period
of $E$. Then the leading coefficient of $L(E, s)$ at $s = 1$ is given by

$$\lim_{s \to 1} \frac{L(E, s)}{(s - 1)^r} = u\,\Omega_E \det\left(\langle P_i, P_j \rangle\right) \tag{5.52}$$

where $u \in \mathbb{Q}^*$.




7.8. Remarks. In the complete formulation³, the number $u$ is explicitly
defined. In fact,

$$u = M \prod_{p \mid \Delta_E} c_p \cdot |E(\mathbb{Q})_{\mathrm{tor}}|^{-2},$$

where $c_p \ge 1$ is an integer which depends on the bad reduction at $p$ and $M$
is the cardinality of the “Tate-Shafarevich group”. This cardinality should
be finite (it has only been shown in certain cases) and should be a perfect
square (which is true if the group in question is finite).


We point out that the conjectural formula is strongly analogous to the
formula which gives the residue at $s = 1$ of the Dedekind zeta function
of a number field $K$ (see formula (6.34)). The Tate-Shafarevich group
corresponds to the class group of ideals in $K$, the torsion group corresponds
to the group of roots of unity, the Néron-Tate regulator corresponds to the
regulator $R_K$ of the units of $K$, etc.


The first observation made by Birch & Swinnerton-Dyer is that, at least
formally, $L(1) = \prod_p \frac{p}{N_p}$ where $N_p := \operatorname{card} E(\mathbb{F}_p)$ (more precisely, the
number of nonsingular points in the reduction modulo $p$ of $E$). Now, $N_p$
is approximately equal to $p$ with a variation of at most $2\sqrt{p}$. We denote
this by $N_p = p + \delta(p)\sqrt{p}$, and thus we have (still formally)
$L(1) = \prod_p (1 + \delta(p)p^{-1/2})^{-1}$. If $E(\mathbb{Q})$ is finite, we can imagine that $N_p$ oscillates regularly
and hence that $\delta(p)$ has the tendency to be well-distributed in the interval
$[-2, 2]$, which would make the product converge. If now $E(\mathbb{Q})$ is infinite,
we might think that we would find more points modulo $p$ and thus that $\delta(p)$
would have the tendency to be positive, which would imply the divergence
of the product, more precisely, would force $L(1)$ to be zero.
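
The formal product $\prod_p p/N_p$ can be explored experimentally, in the spirit of the original observation. The sketch below computes exact partial products over primes up to a bound. It is a simplification: primes of bad reduction must be supplied by the caller and are simply skipped, the function names are illustrative, and no claim is made here about how the partial products behave for any particular curve.

```python
from fractions import Fraction


def count_points(coeffs, p):
    """Points of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 over F_p, plus infinity."""
    a1, a2, a3, a4, a6 = coeffs
    total = 1  # the point at infinity
    for x in range(p):
        rhs = x**3 + a2 * x * x + a4 * x + a6
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                total += 1
    return total


def primes_up_to(bound):
    """Primes <= bound by trial division (adequate for small bounds)."""
    return [n for n in range(2, bound + 1)
            if all(n % d for d in range(2, int(n**0.5) + 1))]


def partial_product(coeffs, bound, bad_primes=()):
    """Exact value of the product of p / N_p over good primes p <= bound."""
    product = Fraction(1)
    for p in primes_up_to(bound):
        if p in bad_primes:
            continue  # simplification: bad primes are left out of this sketch
        product *= Fraction(p, count_points(coeffs, p))
    return product


curve = (0, -1, 1, 0, 0)  # y^2 + y = x^3 - x^2, bad reduction only at 11
print(partial_product(curve, 7, bad_primes=(11,)))
print(float(partial_product(curve, 100, bad_primes=(11,))))
```

Each factor $p/N_p$ is the local factor (5.47) evaluated at $s = 1$, since $(1 - a_pp^{-1} + p^{-1})^{-1} = p/(p + 1 - a_p)$.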


We can think of the conjecture as a sophisticated version of the local/global
principle. Indeed, the function $L(E, s)$ is constructed using fairly simple
information of a local type, essentially the number of points modulo $p$, and
it allows us, thanks to analytic continuation, to recover the rank of the
group $E(\mathbb{Q})$.

Finally, the sign of the functional equation of L(E,s) determines the parity 
of the order of the zero of L(E,s) at s = 1. Thus, conjecturally, the sign 
of the functional equation determines the parity of the rank of the group 
E(Q). This weakened version is called the parity conjecture. 


³ The Birch & Swinnerton-Dyer conjecture is one of the Millennium Prize Problems;
the Clay Mathematics Institute offers a million dollars for its solution.


Chapter 6 


Developments and Open 
Problems 


“Une pierre
deux maisons
trois ruines
quatre fossoyeurs
un jardin
des fleurs

un raton laveur”


JACQUES PRÉVERT


The tone and the level of the prerequisites of this final chapter differ from 
the previous chapters. Here we will present a panorama—necessarily par- 
tial and one-sided—of some important research areas in number theory. 
In particular, every section contains at least one open problem. This last 
chapter also includes many statements whose proofs surpass the level of this 
book but which also provide an opportunity to combine and expand on the 
mathematics introduced in the preceding chapters. The chosen themes— 
the number of solutions of equations over a finite field, algebraic geometry, 
p-adic numbers, Diophantine approximation, the a,b,c conjecture and gen- 
eralizations of zeta and L-series—have all been introduced, either implicitly 
or explicitly, in the previous chapters. We will freely use themes from algebraic
geometry and Galois theory, described respectively in Appendices B
and C.


1. The Number of Solutions of Equations over
Finite Fields


In order to deepen your understanding of this section, you could first consult
Weil’s original article [78] and Appendix C of [35].


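The successes brought about by the introduction of the Riemann $\zeta$ function,
then the Dedekind $\zeta_K$ function, naturally lead to the study of the following
generalization. We consider a finitely generated ring $A$ over $\mathbb{Z}$ or $\mathbb{F}_p$, in
other words $A := \mathbb{Z}[t_1, \ldots, t_n] = \mathbb{Z}[X_1, \ldots, X_n]/I$ or
$A := \mathbb{F}_p[t_1, \ldots, t_n] = \mathbb{F}_p[X_1, \ldots, X_n]/I$. It is easy to see that if $\mathfrak{p}$ is a maximal ideal in $A$, then
$A/\mathfrak{p}$ is a finite field. We denote by $N\mathfrak{p} = \operatorname{card}(A/\mathfrak{p})$. Letting $\mathcal{M}_A$ be the
set of maximal ideals in $A$, we can therefore set

$$\zeta_A(s) := \prod_{\mathfrak{p} \in \mathcal{M}_A} (1 - N\mathfrak{p}^{-s})^{-1}.$$

If $A = \mathbb{Z}$, we recover the Riemann $\zeta$ function, and if $A = \mathcal{O}_K$ (for a number
field $K$), we recover the Dedekind $\zeta_K$ function. Furthermore, in the case
where $\mathbb{Z} \subset A$, every maximal ideal $\mathfrak{p}$ contains exactly one prime number
since $\mathfrak{p} \cap \mathbb{Z}$ is a non-zero prime ideal. If we denote by $\mathcal{M}_{A,p}$ the maximal
ideals which contain $p$ (this set is in bijection with the maximal ideals of
$A/pA$), then $\mathcal{M}_A = \bigcup_p \mathcal{M}_{A,p}$, and we can write:

$$\zeta_A(s) = \prod_p \prod_{\mathfrak{p} \in \mathcal{M}_{A,p}} (1 - N\mathfrak{p}^{-s})^{-1} = \prod_p \zeta_{A/pA}(s).$$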


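We can therefore, at least momentarily, concentrate on the case
$A = \mathbb{F}_p[t_1, \ldots, t_n] = \mathbb{F}_p[X_1, \ldots, X_n]/I$ (we will come back to the case of varieties
defined over $\mathbb{Q}$ or $\mathbb{Z}$ in the last section of this chapter). Let $V$
denote the affine variety defined by the ideal $I$ in $\mathbb{A}^n$. The maximal ideals
of $\bar{\mathbb{F}}_p[t_1, \ldots, t_n]$ correspond to points of $V(\bar{\mathbb{F}}_p)$, and the maximal ideals of
$A = \mathbb{F}_p[X_1, \ldots, X_n]/I$ correspond to conjugacy classes under $\operatorname{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$
in $V(\bar{\mathbb{F}}_p)$. We denote by $|V|$ the set of these classes¹. If $x \in V(\bar{\mathbb{F}}_p)$ and
if $\mathfrak{p}$ is the corresponding maximal ideal in $A$, then $N\mathfrak{p} = p^{\deg(x)}$ where
$\deg(x) := [\mathbb{F}_p(x) : \mathbb{F}_p]$. Since a point in $V(\mathbb{F}_{p^m})$ has a field of definition
equal to $\mathbb{F}_{p^d}$ where $d$ divides $m$, we see that

$$\operatorname{card} V(\mathbb{F}_{p^m}) = \sum_{\substack{x \in |V| \\ \deg(x) \mid m}} \deg(x).$$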


We thus obtain a second expression for $\zeta_A(s)$:

$$\zeta_A(s) = \prod_{x \in |V|} (1 - p^{-\deg(x)s})^{-1} = \exp\left(\sum_{m=1}^{\infty} \operatorname{card} V(\mathbb{F}_{p^m}) \frac{p^{-ms}}{m}\right). \tag{6.1}$$


¹ In the language of Grothendieck schemes, we are talking about closed points of the
scheme $V = \operatorname{spec}(A)$.


By setting $T = p^{-s}$, this leads to the following definition (where now $V$ is
not necessarily affine).


1.1. Definition. Let $V$ be an algebraic variety defined over $\mathbb{F}_q$. Its zeta
function is given by the formal series with integer coefficients:

$$Z(V/\mathbb{F}_q; T) = \prod_{x \in |V|} (1 - T^{\deg(x)})^{-1} = \exp\left(\sum_{m=1}^{\infty} \operatorname{card} V(\mathbb{F}_{q^m}) \frac{T^m}{m}\right). \tag{6.2}$$

In fact, if we write $Z(V/\mathbb{F}_q; T) = \sum_{m \ge 0} a_mT^m$, we can see that $a_m$ is the
number of formal linear combinations² $m_1x_1 + \cdots + m_sx_s$, where $m_i \in \mathbb{N}$
and $\sum_{i=1}^{s} m_i\deg(x_i) = m$.
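
The passage from point counts to the coefficients $a_m$ can be sketched in code. Taking the logarithmic derivative of (6.2) gives $T\,Z'(T) = Z(T)\sum_{m \ge 1} N_mT^m$ with $N_m = \operatorname{card} V(\mathbb{F}_{q^m})$, hence the recursion $k\,a_k = \sum_{m=1}^{k} N_ma_{k-m}$. The function name below is illustrative, and exact fractions are used so that the integrality of the coefficients is visible rather than assumed.

```python
from fractions import Fraction


def zeta_coefficients(point_counts):
    """Coefficients a_0, ..., a_M of exp(sum_{m>=1} N_m T^m / m), given [N_1, ..., N_M]."""
    coeffs = [Fraction(1)]
    for k in range(1, len(point_counts) + 1):
        # k * a_k = sum_{m=1}^{k} N_m * a_{k-m}
        total = sum(point_counts[m - 1] * coeffs[k - m] for m in range(1, k + 1))
        coeffs.append(total / k)
    return coeffs


# Affine line over F_3: N_m = 3^m, and the series is 1/(1 - 3T).
print(zeta_coefficients([3**m for m in range(1, 6)]))

# Projective line over F_2: N_m = 2^m + 1, and the series is 1/((1 - T)(1 - 2T)).
print(zeta_coefficients([2**m + 1 for m in range(1, 4)]))
```

For the projective line over $\mathbb{F}_2$ the coefficient $a_2 = 7$ counts the effective zero-cycles of degree 2: six sums of two rational points (with repetition) plus the single closed point of degree 2.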


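1.2. Examples. Let us compute this series for some varieties.


• The calculation for the affine space $\mathbb{A}^n$ of dimension $n$ is simple:

$$Z(\mathbb{A}^n/\mathbb{F}_q, T) = \exp\left(\sum_{m=1}^{\infty} q^{nm}\frac{T^m}{m}\right) = (1 - q^nT)^{-1}. \tag{6.3}$$

Since $\mathbb{P}^n = \mathbb{A}^n \sqcup \mathbb{A}^{n-1} \sqcup \cdots \sqcup \mathbb{A}^1 \sqcup \mathbb{A}^0$, we have

$$Z(\mathbb{P}^n/\mathbb{F}_q, T) = \prod_{j=0}^{n} (1 - q^jT)^{-1}. \tag{6.4}$$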


• If $V = G(n, k)$ is the Grassmannian which parametrizes the vector
subspaces of dimension $k$ in $\mathbb{A}^n$, we find positive integers $B_j$ such that

$$\operatorname{card} G(n, k)(\mathbb{F}_q) = \prod_{i=0}^{k-1} \frac{q^n - q^i}{q^k - q^i} = \sum_{j=0}^{k(n-k)} B_jq^j, \tag{6.5}$$

$$Z(G(n, k)/\mathbb{F}_q, T) = \prod_{j=0}^{k(n-k)} (1 - q^jT)^{-B_j}. \tag{6.6}$$


For example, for $V = G(4, 2)$ (the space of lines in $\mathbb{P}^3$), we find
$\operatorname{card} V(\mathbb{F}_q) = q^4 + q^3 + 2q^2 + q + 1$, and hence

$$Z(V/\mathbb{F}_q, T) = \frac{1}{(1 - T)(1 - qT)(1 - q^2T)^2(1 - q^3T)(1 - q^4T)}.$$


² In more scholarly terms, we are referring to effective cycles of dimension zero and
degree $m$.


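• If $E$ is an elliptic curve over $\mathbb{F}_q$, we can deduce from Hasse’s theorem
(Theorem 5-6.4) that there exists $\alpha \in \mathbb{C}$ with $|\alpha| = \sqrt{q}$ such that

$$Z(E/\mathbb{F}_q, T) = \exp\left(\sum_{m=1}^{\infty} (q^m + 1 - \alpha^m - \bar{\alpha}^m)\frac{T^m}{m}\right) = \frac{(1 - \alpha T)(1 - \bar{\alpha}T)}{(1 - T)(1 - qT)}. \tag{6.7}$$

We therefore realize that $Z(E/\mathbb{F}_q, T) = (1 - aT + qT^2)(1 - T)^{-1}(1 - qT)^{-1}$
where $a = q + 1 - |E(\mathbb{F}_q)|$. Hence, in this case, knowing $|E(\mathbb{F}_q)|$
is equivalent to knowing $Z(E/\mathbb{F}_q, T)$.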


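• Let $V$ be a nondegenerate quadric in $\mathbb{P}^n$. Theorem 1-5.5, together with
the Davenport-Hasse relation (see Theorem 1-5.15 and Exercise 1-6.25),
gives

$$Z(V/\mathbb{F}_q, T) = \begin{cases} Z(\mathbb{P}^{n-1}/\mathbb{F}_q, T) & \text{if } n \text{ is even,} \\ Z(\mathbb{P}^{n-1}/\mathbb{F}_q, T)\,(1 - \varepsilon q^{\frac{n-1}{2}}T)^{-1} & \text{if } n \text{ is odd,} \end{cases}$$

where $\varepsilon = \left(\frac{-1}{q}\right)^{\frac{n-1}{2}}\left(\frac{D}{q}\right)$ and $D$ is the discriminant of the quadratic
form.


• Let $V$ be the smooth intersection of two quadrics $Q_1 = a_0x_0^2 + \cdots + a_nx_n^2 = 0$
and $Q_2 = b_0x_0^2 + \cdots + b_nx_n^2 = 0$ in $\mathbb{P}^n$ where $n$ is even. The
computations done in Exercise 1-6.24, together with the Davenport-Hasse
relation, yield

$$Z(V/\mathbb{F}_p, T) = Z(\mathbb{P}^{n-2}/\mathbb{F}_p, T) \prod_{i=0}^{n} (1 - \eta_ip^{\frac{n-2}{2}}T)^{-1},$$

where $\eta_i = \left(\frac{-1}{p}\right)^{\frac{n}{2}}\left(\frac{D_i}{p}\right)$ and $D_i := \prod_{j \ne i}(b_ia_j - a_ib_j)$.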


• Let $V$ be a Fermat hypersurface given by the equation $a_0x_0^d + \cdots + a_nx_n^d = 0$.
Theorem 1-5.13, together with the Davenport-Hasse relation, yields

$$Z(V/\mathbb{F}_q, T) = Z(\mathbb{P}^{n-1}/\mathbb{F}_q, T)\,P(T)^{(-1)^n},$$

where

$$P(T) := \prod_{(\chi_0, \ldots, \chi_n) \in S} \left(1 - (-1)^{n+1}q^{-1}\chi_0(a_0) \cdots \chi_n(a_n)G(\chi_0) \cdots G(\chi_n)T\right).$$

Here, $S$ designates the set of $(n+1)$-tuples of characters different from
the unitary character and such that the $\chi_i^d$, as well as $\chi_0 \cdots \chi_n$, are
equal to the unitary character. In particular, we have
$B_{n-1}(d) := \deg(P(T)) = ((d-1)^{n+1} + (-1)^{n+1}(d-1))d^{-1}$ and the equality

$$|q^{-1}\chi_0(a_0) \cdots \chi_n(a_n)G(\chi_0) \cdots G(\chi_n)| = q^{\frac{n-1}{2}}.$$

This thus yields the estimate

$$|\operatorname{card} V(\mathbb{F}_q) - \operatorname{card} \mathbb{P}^{n-1}(\mathbb{F}_q)| \le B_{n-1}(d)\,q^{\frac{n-1}{2}}. \tag{6.8}$$


The following theorem (conjectured by André Weil [78]) lets us extend
the previous inequality (6.8) to every smooth hypersurface of degree $d$ in
$\mathbb{P}^n$. We can verify it in each of the previous examples, and its
proof, which goes far beyond the level of this text, occupied specialists in
algebraic geometry for twenty years. Grothendieck earned the Fields Medal
in 1966 for his work on developing the theory of algebraic geometry. Deligne
earned the same distinction in 1978 for completing this theory.


1.3. Theorem. (Weil conjectures, Grothendieck’s and Deligne’s theorems)
Let $V$ be a smooth projective variety of dimension $r$.

1) (Rationality) The function $Z(V/\mathbb{F}_q; T)$ is a rational function in the
indeterminate $T$.

2) (Functional equation) There exists an integer $\chi(V)$ and a sign $\epsilon = \pm 1$
such that

$$Z\left(V/\mathbb{F}_q; \frac{1}{q^rT}\right) = \epsilon\,q^{\frac{r\chi(V)}{2}}\,T^{\chi(V)}\,Z(V/\mathbb{F}_q; T). \tag{6.9}$$

3) (Riemann hypothesis) There exist polynomials $P_i(T) \in \mathbb{Z}[T]$ such that

$$Z(V/\mathbb{F}_q; T) = \frac{P_1(T) \cdots P_{2r-1}(T)}{P_0(T) \cdots P_{2r}(T)} = \prod_{i=0}^{2r} P_i(T)^{(-1)^{i+1}},$$

and $P_i(T) = \prod_{j=1}^{B_i} (1 - \alpha_{i,j}T)$ with $|\alpha_{i,j}| = q^{i/2}$.

4) Suppose that $V$ is the reduction modulo a prime ideal of a variety $\mathcal{V}$
defined over a number field. Then the numbers $B_i$ are the topological
Betti numbers of the variety $\mathcal{V}$.


1.4. Remarks. In particular, by 3) we have

The meaning of the last statement is the following. Let $\mathcal{V}$ be a smooth
projective variety defined over a number field $K$ and having good reduction
at a prime ideal $\mathfrak{p}$ of $\mathcal{O}_K$ such that $\mathcal{O}_K/\mathfrak{p} = \mathbb{F}_q$ (see Appendix B). We can
therefore consider the complex variety $\mathcal{V}(\mathbb{C})$, to which we can associate
the Betti numbers $B_i := \dim H^i(\mathcal{V}(\mathbb{C}), \mathbb{Q})$. We can also consider the
reduced variety modulo $\mathfrak{p}$ denoted $V/\mathbb{F}_q$. Statement 4) therefore says that
the numbers $B_i = \deg(P_i(T))$, given by the first part of the theorem, are
exactly the Betti numbers of $\mathcal{V}(\mathbb{C})$.


With this interpretation, we can see that $\chi = \sum_{i=0}^{2r} (-1)^iB_i$, and it is therefore
legitimate to call the latter ($\in \mathbb{Z}$) the Euler-Poincaré characteristic of
the variety $V$.


The reason why Statement 3) of Theorem 6-1.3 is called the Riemann
hypothesis is an analogy. To be more specific, if we go back to the function
initially associated to $V/\mathbb{F}_q$, it can be written

$$\zeta_V(s) = Z(V/\mathbb{F}_q; q^{-s}) = \prod_{i=0}^{2r} P_i(q^{-s})^{(-1)^{i+1}},$$

and the Riemann hypothesis (Statement 3) of the theorem) can be translated
into the assertion that the zeros (for odd $i$) or poles (for even $i$) are
situated on the lines $\operatorname{Re}(s) = \frac{i}{2}$.


1.5. Remark. One of the most modern applications of Theorem 6-1.3 is
the upper bound on exponential sums, of which we studied a typical
example given by Gauss sums. We could, for example, prove using these
techniques the following estimate due to Deligne, where we assume that
$F(x) = F_d(x) + F_{d-1}(x) + \cdots + F_1(x)$ with $F_i$ homogeneous of degree $i$ and
$F_d$ smooth, i.e., the hypersurface that it defines in $\mathbb{P}^{n-1}$ is smooth, and
also that $p$ does not divide $d$:


$$\left|\sum_{x \in (\mathbb{F}_p)^n} \exp\left(\frac{2\pi iF(x)}{p}\right)\right| \le B_{n,d}\,p^{n/2}. \tag{6.10}$$


Before that, Weil proved these assertions for curves and deduced from
them an upper bound for Kloosterman sums over integers modulo a prime $p$
(see Exercise 1-6.18 for sums modulo any integers) which does not divide $ab$:

$$\left|\sum_{x \in \mathbb{F}_p^*} \exp\left(\frac{2\pi i(ax + bx^{-1})}{p}\right)\right| \le 2\sqrt{p}. \tag{6.11}$$

The smoothness constraints are essential if we want the estimates to be fine.
For example, the sum $\sum_{x, y, z \in \mathbb{F}_p} \exp\left(\frac{2\pi ixyz}{p}\right)$ equals $2p^2 - p$, which is
much larger than $p^{3/2}$. Nonetheless, Theorem 6-1.3 does not supersede the
following more robust (and older) result:


1.6. Theorem. (Lang-Weil [47]) There exists a constant $C = C(n, r, d)$
such that for every closed (irreducible) subvariety $V$ of $\mathbb{P}^n$ of dimension $r$
and degree $d$, we have the following estimate:

$$|\operatorname{card} V(\mathbb{F}_q) - \operatorname{card} \mathbb{P}^r(\mathbb{F}_q)| \le (d - 1)(d - 2)q^{r - 1/2} + C(n, r, d)q^{r-1}. \tag{6.12}$$
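
The two exponential sums discussed just before Theorem 1.6 can be evaluated directly for small primes. The sketch below is only a numerical illustration: the function names are illustrative, the sums are computed in floating point (so comparisons need a small tolerance), and `pow(x, -1, p)` for the modular inverse requires Python 3.8 or later.

```python
import cmath
import math


def kloosterman_sum(a, b, p):
    """Sum over x in F_p^* of exp(2*pi*i*(a*x + b*x^(-1)) / p)."""
    total = 0
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # inverse of x modulo p
        total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % p) / p)
    return total


def degenerate_cubic_sum(p):
    """Sum over x, y, z in F_p of exp(2*pi*i*x*y*z / p)."""
    total = 0
    for x in range(p):
        for y in range(p):
            for z in range(p):
                total += cmath.exp(2j * math.pi * ((x * y * z) % p) / p)
    return total


p = 11
worst = max(abs(kloosterman_sum(a, b, p)) for a in range(1, p) for b in range(1, p))
print(worst <= 2 * math.sqrt(p))               # Weil's bound (6.11)
print(round(degenerate_cubic_sum(5).real))     # compare with 2*5**2 - 5
```

The second sum is large because the inner sum over $z$ equals $p$ whenever $xy = 0$, which happens for $2p - 1$ pairs $(x, y)$, and vanishes otherwise.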


1.7. Remarks. 1) Let $P_i(T) = P_i(V/\mathbb{F}_q, T) = \prod_{j=1}^{B_i} (1 - \alpha_{i,j}T)$ be one
of the polynomials associated to $V$ by Theorem 6-1.3. Let $m_i = m_i(V)$ be
the multiplicity with which $\alpha_{i,j} = q^{i/2}$. Since $\bar{\alpha}_{i,j} = q^i/\alpha_{i,j}$, we have

$$P_i(T) = \prod_{j=1}^{B_i} (1 - \alpha_{i,j}T) = \prod_{j=1}^{B_i} (-\alpha_{i,j}) \cdot T^{B_i}\,P_i\left(\frac{1}{q^iT}\right).$$

Furthermore, we can easily see that $\prod_{j=1}^{B_i} (-\alpha_{i,j}) = (-1)^{m_i}q^{iB_i/2}$, which
gives us a functional equation for $P_i(T)$ of the form:

$$P_i(T) = (-1)^{m_i}\,q^{\frac{iB_i}{2}}\,T^{B_i}\,P_i\left(\frac{1}{q^iT}\right). \tag{6.13}$$

2) Moreover, by identifying the factors according to the absolute value of
their roots in the functional equation, we obtain

$$P_{2r-i}(T) = P_i(q^{r-i}T), \tag{6.14}$$

from which we can immediately deduce that $m_{2r-i} = m_i$ and $B_{2r-i} = B_i$.
This yields the equality $\chi' := \sum_{i=0}^{2r} (-1)^i\,iB_i = r\sum_{i=0}^{2r} (-1)^iB_i = r\chi$.


3) By referring to formulas (6.13) and (6.14), we recover the functional
equation of the function $Z(V, T)$ using the supplementary information that
$\epsilon = (-1)^{m_r}$.


4) By using the fact that a symplectic isometry has determinant 1, we can
prove that if $r := \dim(V)$ is odd, for example for an algebraic curve, then
the sign of the functional equation of $Z(V/\mathbb{F}_q, T)$ is $+1$. Whenever $r$ is even,
the sign can be positive or negative and remains more or less mysterious.
To conjecturally describe its behavior, Tate suggested comparing it to the
rank of the group $\operatorname{Num}^i(V)$ of cycles of codimension $i$ modulo numerical
equivalence (see Appendix B, Definition B-2.9).


1.8. Conjecture. (Tate) The multiplicity $m_{2i}(V)$ is equal to the dimension
of the subspace generated by the algebraic subvarieties of codimension
$i$ modulo the numerical equivalence relation.


In particular, the conjecture indicates that the sign of the functional equation
should be $\epsilon = (-1)^{\operatorname{rank} \operatorname{Num}^{r/2}(V)}$. Tate proved that it is always the case
that $\operatorname{rank} \operatorname{Num}^i(V) \le m_{2i}(V)$, but we only know how to prove equality in
certain cases, among which are the examples outlined previously in this
section.


2. Diophantine Equations and Algebraic 
Geometry 


In this section, we will consider the question (raised essentially by Serge
Lang, [45] and [46]) of establishing a correspondence or dictionary between
arithmetic and geometric properties of algebraic varieties. More concretely,
if we are given a system of equations with integer coefficients

$$V : \quad f_1(X_1, \ldots, X_n) = \cdots = f_r(X_1, \ldots, X_n) = 0,$$

we want to find connections between the qualitative properties (finiteness,
density, etc.) of the rational solutions $V(\mathbb{Q})$ and the properties of the analytic
or algebraic variety of the complex solutions $V(\mathbb{C})$. The existence
of such a dictionary between algebraic and analytic geometry and arithmetic
is very largely conjectural, but we nevertheless have some (deep) theorems
and precise questions. You can consult [38] to acquire a deeper insight into
this subject.


We will start by describing the predicted geometric properties. 


To simplify things, we will assume in this section that the varieties are
smooth and projective. By imitating differential geometry, we can give an
algebraic definition of regular differential forms on an algebraic variety $V$
as linear forms on the tangent space which are defined everywhere locally
(i.e., for every point $P$, on an open set containing the point $P$) by the form

$$\omega = \sum_i f_i\,dg_i,$$

where the $f_i, g_i$ are algebraic functions on $V$ without poles at $P$. We
likewise define the space of regular differential $k$-forms as those which can
be expressed locally as

$$\omega = \sum f_{i_1, \ldots, i_k}\,dg_{i_1} \wedge \cdots \wedge dg_{i_k}.$$

2.1. Example. We will give a nontrivial example where the necessity of
describing the differential form in various charts intervenes. Let $E$ be an
elliptic curve in $\mathbb{P}^2$ given by the equation $ZY^2 = X^3 + aXZ^2 + bZ^3$. Let
$x = X/Z$ and $y = Y/Z$ so that we can define

$$\omega = \frac{dx}{y} = \frac{2\,dy}{3x^2 + a}.$$

Since $y$ and $3x^2 + a$ do not simultaneously vanish (the curve is assumed
to be smooth), the form $\omega$ does not have a pole outside of the point at
infinity. We can also see that it is regular around the point at infinity.


By setting $u := 1/x$ and $v := y/x^2$, the equation of the curve becomes
$v^2 = u + au^3 + bu^4$, and we obtain

$$\frac{dx}{y} = -\frac{du}{v} = -\frac{2\,dv}{1 + 3au^2 + 4bu^3},$$

which is clearly regular at the point $(u, v) = (0, 0)$.
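
For readers who want the intermediate steps of this change of chart, the computation can be written out as follows. This is only a worked version of the substitution described in the example (with $x = 1/u$ and $y = v/u^2$), not an additional result.

```latex
% Chart at infinity: u = 1/x, v = y/x^2, hence x = 1/u and y = v/u^2.
\begin{align*}
y^2 = x^3 + ax + b
  &\;\Longrightarrow\; \frac{v^2}{u^4} = \frac{1}{u^3} + \frac{a}{u} + b
  \;\Longrightarrow\; v^2 = u + au^3 + bu^4, \\
dx = -\frac{du}{u^2}, \quad y = \frac{v}{u^2}
  &\;\Longrightarrow\; \frac{dx}{y} = -\frac{du}{v}, \\
2v\,dv = (1 + 3au^2 + 4bu^3)\,du
  &\;\Longrightarrow\; -\frac{du}{v} = -\frac{2\,dv}{1 + 3au^2 + 4bu^3}.
\end{align*}
```

At $(u, v) = (0, 0)$ the denominator $1 + 3au^2 + 4bu^3$ equals 1, which is why the last expression shows regularity at the point at infinity.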


The set of regular differential $k$-forms forms a vector space denoted $\Omega^k[V]$.
Whenever $V$ is projective, this space has finite dimension denoted by $g_k(V)$.
The following invariant is particularly important.


2.2. Definition. The genus of a smooth, projective algebraic variety of
dimension $r$ is the dimension of the space of regular differential $r$-forms:

$$g(V) := \dim \Omega^r[V].$$

Observe that any two differential $r$-forms, $\omega$ and $\omega'$, on $V$ are “proportional”
in the sense that there exists a function, $f$, such that $\omega' = f\omega$. By choosing
a basis, $\omega_1, \ldots, \omega_{g(V)}$, for $\Omega^r[V]$, this allows us to define the canonical map
$\Phi_V : V \dashrightarrow \mathbb{P}^{g(V)-1}$ given by $x \mapsto (\omega_1(x), \ldots, \omega_{g(V)}(x))$. This map is
rational (cf. definition and notation in Appendix B, page 274): it is only
defined on the open set $V \setminus Z$, where $Z$ is the locus of the common zeros of
the $\omega_i$. More explicitly, this map can be described as follows (at least on an
open set): there exist rational functions $f_i$ such that $\omega_i = f_i\omega_1$, and therefore,
the function $\Phi_V$ can be written as $\Phi_V(P) = (1, f_2(P), \ldots, f_{g(V)}(P))$
for $P$ which are not poles of the $f_i$. By considering the tensor powers of
differential forms, i.e., expressions of the form $\omega_1 \otimes \cdots \otimes \omega_m$, we obtain the
space of differential $k$-forms of weight $m$, the plurigenera $g(m, V)$ which
are the dimensions of the space of differential $r$-forms of weight $m$, and the
pluricanonical maps $\Phi_{m,V} : V \dashrightarrow \mathbb{P}^{g(m,V)-1}$.


2.3. Definition. 1) The Kodaira dimension of a variety is $-\infty$ if $g(m, V) = 0$
for every $m$ and if not is given by

$$\kappa(V) := \max_m \dim \Phi_{m,V}(V).$$

2) A projective variety is pseudo-canonical (or “of general type”) if $\kappa(V) = \dim V$.


If V is a variety defined over a number field K, we can embed K into C 
and consider the complex points V(C); we thus obtain an analytic variety, 
i.e., a variety defined by holomorphic functions. 


2.4. Definition. A Riemann surface is a complex analytic variety of
dimension 1. It is called algebraic if we can represent it as the set of
complex points on an algebraic curve.³


It is fairly easy to show that if V is a projective curve, then V(C) is 
compact. The converse is a deep theorem of Riemann: every compact 
Riemann surface is algebraic (and projective). 


2.5. Examples. 1) The genus of a smooth projective curve can be any
natural number. We have $g = 0$ if $V = \mathbb{P}^1$ (since $\Omega^1[\mathbb{P}^1] = 0$) and $g = 1$ if
$V$ is an elliptic curve (a basis for $\Omega^1[V]$ is given by $\omega := dx/y$); a smooth
projective plane curve of degree $d$ has genus $g = (d - 1)(d - 2)/2$. If $V$
is defined over $\mathbb{C}$, then $V(\mathbb{C})$ is a compact Riemann surface, and $g(V)$
coincides with the number of handles or holes in the surface.

[Figure: compact Riemann surface; panel label: $g \ge 2$, $V(\mathbb{C}) = U/\Gamma$]

For a smooth plane curve $V$ of degree $d$, we find that $V$ is isomorphic to
$\mathbb{P}^1$ if $d = 1$ or 2; that $g = 1$ if $d = 3$, and therefore, $V(\mathbb{C}) \cong \mathbb{C}/\Lambda$ (i.e., $V$
is an elliptic curve, see Chap. 5); and finally, that $g \ge 3$ if $d \ge 4$.

2) If now $V$ is a smooth projective hypersurface in $\mathbb{P}^n$ of degree $d$, then
the Kodaira dimension is given by

$$\kappa(V) = \begin{cases} -\infty & \text{if } d \le n, \\ 0 & \text{if } d = n + 1, \\ \dim(V) & \text{if } d \ge n + 2. \end{cases}$$


³ We would like to point out the classical naming conflict of calling the same object a
“curve” and a “surface”.

One of the deepest results linking the geometry of curves to their arithmetic 
properties was proven by Faltings in 1983 (see [16], [30], [38]). Faltings 
received the Fields Medal in 1986 for this work. 


2.6. Theorem. (Mordell conjecture, Faltings’s theorem) Let $C$ be a curve of genus $g \geq 2$ defined over a number field $K$. Then $C(K)$ is finite.


This theorem completes the prior results of Siegel dating back to 1929 (see 
Theorem 5-4.1). In order to give a geometric statement of this theorem, it 
will be convenient to use the following notation. 


2.7. Definition. Let $C$ be a smooth projective curve of genus $g$ and $T$ a finite set of points. We denote by $U = C \setminus T$ the corresponding curve (which is affine if $T \neq \emptyset$). The Euler-Poincaré characteristic of $U$ is defined by $\chi(U) = 2 - 2g - |T|$.


2.8. Theorem. (Siegel’s theorem) Let $C$ be a smooth projective curve of genus $g$ defined over a number field $K$ and $T$ a finite set of points. We denote by $U = C \setminus T$ the corresponding affine curve. If $\chi(U) < 0$, then the set of integral points $U(\mathcal{O}_K)$ is finite.


Siegel’s theorem was generalized to S-integral points by Mahler. If $g \geq 2$, Siegel’s theorem is surpassed by Faltings’s theorem, and if $g = 1$, we essentially obtain Theorem 5-4.1. If $g = 0$ and $|T| \geq 3$, the statement is equivalent to the unit theorem (Theorem 5-4.4), which affirms, for example, that the curve given by the equation $xy(y-1) = 1$ only has a finite number of S-integral points.


For the proof, see for example [38]. To illustrate the importance of Theorem 6-2.6, we will restate it as part iii) of the following theorem. If $V$ is a projective subvariety of $\mathbb{P}^n$, we set:
$$N(V(\mathbb{Q}), H, B) := \operatorname{card}\{x \in V(\mathbb{Q}) \mid H(x) \leq B\}.$$


2.9. Theorem. Let V be a projective variety defined over Q. Then we 
have the following asymptotic estimates as B tends to infinity. 


i) If $V = \mathbb{P}^n$, then
$$N(\mathbb{P}^n(\mathbb{Q}), H, B) \sim \frac{2^n}{\zeta(n+1)}\, B^{n+1}.$$
ii) If $V = E$ is an elliptic curve of rank $r = \operatorname{rank} E(\mathbb{Q})$, then
$$N(E(\mathbb{Q}), H, B) \sim c_E (\log B)^{r/2}.$$
iii) If $V$ is a curve of genus $\geq 2$, then $N(V(\mathbb{Q}), H, B)$ becomes constant for large enough $B$.


2.10. Remark. We underline the fact that from an arithmetical point of view, we have the following trichotomy of curves $V$:
1) genus 0 curves with many rational points and $\kappa(V) = -\infty$;
2) genus 1 curves with few rational points and $\kappa(V) = 0$;
3) genus $\geq 2$ curves with a finite number of rational points and $\kappa(V)$ maximal.


It is this trichotomy that the Lang conjectures are trying to generalize. 
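Proof. Point iii) is a reformulation of Theorem 6-2.6. To prove i), we introduce the functions $F(B) := \operatorname{card}\{x \in \mathbb{Z}^{n+1} \mid 0 < \max|x_i| \leq B\}$ and $G(B) := \operatorname{card}\{x \in \mathbb{Z}^{n+1} \mid 0 < \max|x_i| \leq B,\ \gcd(x_0,\dots,x_n) = 1\}$. We can see that $F(B) = (2\lfloor B\rfloor + 1)^{n+1} - 1 = (2B)^{n+1} + O(B^n)$. By regrouping the elements of $\mathbb{Z}^{n+1}$ according to the gcd of their coordinates, we see that
$$F(B) = \sum_{d \leq B} G(B/d).$$
By using the Möbius formula (see Exercise 4-6.5), we obtain
$$\begin{aligned}
G(B) &= \sum_{d \leq B} \mu(d) F(B/d) \\
&= (2B)^{n+1} \sum_{d \leq B} \frac{\mu(d)}{d^{n+1}} + O\Big(B^n \sum_{d \leq B} \frac{1}{d^n}\Big) \\
&= (2B)^{n+1} \sum_{d=1}^{\infty} \frac{\mu(d)}{d^{n+1}} + O(B^n \log B) \\
&= \frac{2^{n+1}}{\zeta(n+1)}\, B^{n+1} + O(B^n \log B),
\end{aligned}$$
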
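where the term $\log B$ can be omitted if $n \geq 2$. Since every point of $\mathbb{P}^n(\mathbb{Q})$ is represented by exactly two primitive integer vectors $\pm x$, we have $N(\mathbb{P}^n(\mathbb{Q}), H, B) = G(B)/2$, which gives i).

To prove ii), we use the quadraticity (see Theorem 5-2.2.2) of the (logarithmic) height on an elliptic curve. If $\|\cdot\|$ is a Euclidean norm on $\mathbb{R}^r$, we will prove that $\operatorname{card}\{x \in \mathbb{Z}^r \mid \|x\| \leq X\} \sim cX^r$ using the following argument. Let $\gamma = \max_{x \in [0,1]^r} \|x\|$. Then we have the inclusions
$$\{x \in \mathbb{R}^r ;\ \|x\| \leq X - \gamma\} \subset \bigcup_{\substack{x \in \mathbb{Z}^r \\ \|x\| \leq X}} \big(x + [0,1]^r\big) \subset \{x \in \mathbb{R}^r ;\ \|x\| \leq X + \gamma\}.$$
By considering the volumes (where the volume of the ball $\|\cdot\| \leq 1$ is denoted by $v_r$), this yields:
$$v_r (X - \gamma)^r \leq \operatorname{card}\{x \in \mathbb{Z}^r ;\ \|x\| \leq X\} \leq v_r (X + \gamma)^r,$$
and hence $\operatorname{card}\{x \in \mathbb{Z}^r ;\ \|x\| \leq X\} \sim v_r X^r$. By applying this to the Néron-Tate height $\hat{h} : E(\mathbb{Q}) \to \mathbb{R}$ and its associated real quadratic form $\hat{h}_{\mathbb{R}} : E(\mathbb{Q}) \otimes \mathbb{R} \to \mathbb{R}$ (see Chap. 5), we obtain
$$\operatorname{card}\{P \in E(\mathbb{Q}) \mid \hat{h}(P) \leq X\} = |E(\mathbb{Q})_{\mathrm{tor}}| \cdot \big|\{x \in E(\mathbb{Q})/E(\mathbb{Q})_{\mathrm{tor}} \subset E(\mathbb{Q}) \otimes \mathbb{R} \mid \hat{h}_{\mathbb{R}}(x) \leq X\}\big| \sim c X^{r/2}.$$
By choosing $\hat{H} := \exp \hat{h}$, we thus have
$$N(E(\mathbb{Q}), \hat{H}, B) \sim c_E (\log B)^{r/2},$$
and we can easily see that the estimate still holds if we replace $\hat{H}$ by $H$ since for every $P$, $C^{-1} H(P) \leq \hat{H}(P) \leq C H(P)$.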
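
As a numerical illustration of estimate i), here is a minimal Python sketch that counts the points of $\mathbb{P}^n(\mathbb{Q})$ of height at most $B$ by brute force: it enumerates the primitive integer vectors with coordinates in $[-B, B]$ and identifies $x$ with $-x$. The function name `count_projective_points` is ours, and the enumeration is only practical for small $n$ and $B$.

```python
from functools import reduce
from itertools import product
from math import gcd, pi


def count_projective_points(n, B):
    """Number of points of P^n(Q) with height <= B (B a positive integer).

    A point is represented by a primitive vector of Z^(n+1), unique up to sign,
    and its height is the largest absolute value of the coordinates.
    """
    primitive = 0
    for x in product(range(-B, B + 1), repeat=n + 1):
        if any(x) and reduce(gcd, x) == 1:
            primitive += 1
    return primitive // 2  # x and -x define the same projective point


if __name__ == "__main__":
    B = 100
    observed = count_projective_points(1, B)
    predicted = 2 / (pi ** 2 / 6) * B ** 2  # 2^n / zeta(n+1) * B^(n+1) with n = 1
    print(observed, predicted, observed / predicted)
```

For $n = 1$ the predicted main term is $12B^2/\pi^2$, since $\zeta(2) = \pi^2/6$; the ratio printed by the script should be close to 1 already for moderate $B$.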


We will now consider an algebraic curve over the field of complex numbers. As we have seen, its complex points form a Riemann surface. Simply connected Riemann surfaces were classified by Riemann. There are three of them: the sphere or projective line $\mathbb{P}^1(\mathbb{C})$, the affine plane or line $\mathbb{C} = \mathbb{P}^1(\mathbb{C}) \setminus \{\infty\}$ and the unit disk $U := \{z \in \mathbb{C} \mid |z| < 1\}$. The universal covering of a projective curve of genus 0 (resp. genus 1, resp. genus $\geq 2$) is the projective line (resp. plane, resp. disk). Therefore, the corresponding analytic variety is either $\mathbb{P}^1(\mathbb{C})$, $\mathbb{C}/\Omega$ (where $\Omega$ is a lattice in $\mathbb{C}$) or $U/\Gamma$ (where $\Gamma$ is a discrete subgroup of $\operatorname{Aut}(U)$). The parallel notion in complex geometry can be underlined using the following observation based on Picard’s theorem, which states that an entire, non-constant function $f : \mathbb{C} \to \mathbb{C}$ takes all complex values except for at most one (think about the exponential function). We will also use the following topological property of the universal covering: if $\pi : \tilde{S} \to S$ is the universal covering of $S$, every holomorphic map $f : \mathbb{C} \to S$ can be factored through $\pi$, in other words, there exists a holomorphic map $\tilde{f} : \mathbb{C} \to \tilde{S}$ such that
$$f = \pi \circ \tilde{f}.$$


2.11. Proposition. Let $S$ be an algebraic Riemann surface, in other words, an algebraic curve of genus $g$ minus $s$ points, and let $\chi(S) := 2 - 2g - s$ be its Euler-Poincaré characteristic. There exists a non-constant, holomorphic map $f : \mathbb{C} \to S$ if and only if $\chi(S) \geq 0$.


Proof. If $g = 0$, then $S = \mathbb{P}^1(\mathbb{C}) \setminus \{P_1,\dots,P_s\}$, and by Picard’s theorem, a non-constant, holomorphic map can only exist if $s = 0$, $1$ or $2$, i.e., if $S = \mathbb{P}^1(\mathbb{C})$ or $S = \mathbb{P}^1(\mathbb{C}) \setminus \{P_1\} \cong \mathbb{C}$ or $S = \mathbb{P}^1(\mathbb{C}) \setminus \{P_1, P_2\} \cong \mathbb{C} \setminus \{0\}$. If $g = 1$, then $S = E(\mathbb{C}) \setminus \{P_1,\dots,P_s\}$ where $E(\mathbb{C}) = \mathbb{C}/\Omega$. If $s = 0$, we have a holomorphic map from $\mathbb{C} \to \mathbb{C}/\Omega$, but if $s > 0$, we would obtain, by lifting the map to the universal covering $\mathbb{C}$, an entire function which does not take infinitely many values and is therefore necessarily constant. Finally, if $g \geq 2$, every holomorphic map $f : \mathbb{C} \to S$ can be factored through the universal covering, which is the disk, and hence $f$ is constant.


This suggests the following definition, due to Brody. 


2.12. Definition. A complex analytic variety X is hyperbolic (in Brody’s 
meaning) if every holomorphic map C — X(C) is constant. 


With this definition, we can see that projective curves which are hyperbolic are exactly those of genus $\geq 2$, in other words, those for which the finiteness of the number of rational points was proven by Faltings. Affine curves which are hyperbolic are those of genus $\geq 1$ or of genus 0 with at least three points at infinity, in other words, those for which the finiteness of the number of integral points was proven by Siegel.


In order to take into account the case where the images of holomorphic maps 
are contained in a subvariety, Lang introduced the following definitions. 


2.13. Definition. 1) Let X be an algebraic variety defined over C. The 
analytic special set is the closure (for the usual topology) of the union of 
the images of non-constant holomorphic maps f : C — X(C). 


2) Let V be an algebraic variety. The algebraic special set is the closure 
(for the Zariski topology) of the union of the images of non-constant maps 
from an algebraic group to V. 


Let us point out that since $\mathbb{P}^1$ is the image of, for example, an elliptic curve, the special set contains all of the rational curves.

The idea of the conjecture, due to Serge Lang, is to attempt to generalize the good dictionary between arithmetic, algebraic and geometric properties to varieties of higher dimension.


2.14. Conjecture. (Lang [45]) Let $V$ be a projective algebraic variety defined over a number field.
1) The following three properties are equivalent.
i) The variety $V$ has a finite number of rational points over every number field.
ii) Every subvariety of $V$ (including itself) is pseudo-canonical.
iii) The analytic variety $V(\mathbb{C})$ is hyperbolic.
2) The complement of the algebraic special set of $V$ has a finite number of rational points over every number field.


This conjecture is therefore a theorem if $\dim(V) = 1$, thanks essentially to the result of Faltings. Here is a very concrete open problem: is it true that the set of rational points over $\mathbb{Q}$ of the surface $V$, defined in $\mathbb{P}^3$ by
$$X_0^5 + X_1^5 + X_2^5 + X_3^5 = 0,$$
lies on a finite set of curves? For example, the lines $X_i + X_j = X_k + X_\ell = 0$ lie on $V$; are there other curves having infinitely many rational points?


We could state a variation, started by Lang and completed by Vojta [75], of this conjecture concerning the integral points on affine varieties. To do this, it will be convenient to use the following definitions and conventions. We will always be considering an affine variety $U \subset \mathbb{A}^n$ as a projective variety $V \subset \mathbb{P}^n$ minus a hyperplane section $D := V \cap H$ where $H := \mathbb{P}^n \setminus \mathbb{A}^n$ is the hyperplane “at infinity”. We will assume that $D$ has “normal crossing”, which means that $D$ is the union of $r$ irreducible smooth components $D_1, D_2, \dots, D_r$ and the $D_i$ intersect transversely. In particular,
$$\dim D_{i_1} \cap \dots \cap D_{i_s} \leq \dim V - s,$$
and the tangent spaces intersect with the same dimensions. Hironaka’s desingularization theorem tells us that an affine variety can always be represented as such.


If $r = \dim U = \dim V$, we will now look at the differential r-forms on $V$ which are regular on $U$ and which have at most a simple pole along $D$. We denote by $\Omega(V)[D]$ the vector space of these forms, by $g(m, V, D)$ its dimension and by $\Phi_{m,V,D} : U \dashrightarrow \mathbb{P}^{g(m,V,D)-1}$ the induced map. The logarithmic Kodaira dimension is therefore defined as
$$\kappa(V, D) := \max_m \dim \Phi_{m,V,D}(U).$$
In fact, with the given conventions, this integer only depends on $U$. The space $U$ is said to be log canonical if $\kappa(V, D) = \dim U$.


2.15. Conjecture. (Lang-Vojta) Let $V$ be a smooth projective algebraic variety defined over a number field, $D$ a hyperplane section with normal crossing and $U := V \setminus D$ the corresponding affine variety.
1) The following three properties are equivalent.
i) The variety $U$ possesses a finite number of S-integral points for every number field and finite set of places $S$.
ii) Every subvariety of $U$ (including itself) is log canonical.
iii) The analytic variety $U(\mathbb{C})$ is hyperbolic.
2) The complement of the special set of $U$ possesses a finite number of S-integral points for every number field and finite set of places $S$.


This conjecture is equivalent to Siegel’s theorem in the case of curves. Very few cases are known in dimension $\geq 2$. For example, according to the conjecture, the surface given by the equation
$$-1 - x^4 + y^4 + z^4 = 0$$
should have a finite number of integral points outside of a finite number of curves, such as the line $x - y = z - 1 = 0$ for example.


We will finish with an example due to Vojta which illustrates the necessity of the “normal crossing” hypothesis. Consider, in the projective plane with coordinates $(x,y,z)$, the hyperplane section $D$ composed of two lines $D_1$ and $D_2$ whose equations are given by $x = 0$ and $y = 0$ and the conic $D_3$ given by $z(x - y) - (x + y)^2 = 0$. We point out that the divisor $D = D_1 + D_2 + D_3$ does not have normal crossing, since $D_1 \cap D_2 \cap D_3 = \{(0,0,1)\}$. If it did have normal crossing, the Lang-Vojta conjecture would predict that the S-integral points are not Zariski dense. For $U := \mathbb{P}^2 \setminus D$, the algebra of coordinates of $U$ is generated by the functions
$$f_1 = \frac{x}{y}, \quad f_2 = \frac{y}{x}, \quad f_3 = \frac{z}{y} \quad \text{and} \quad f_4 = \frac{4y^2}{z(x-y) - (x+y)^2}.$$
The S-integral points are therefore the points where the functions take S-integral values.

If $k \in \mathbb{Z}$ and $\varepsilon \in \mathcal{O}_{K,S}^*$, we define the point
$$P_{k,\varepsilon} = \Big(\varepsilon,\ 1,\ \varepsilon + 3 - \frac{4(\varepsilon^k - 1)}{\varepsilon - 1}\Big).$$
We can check that $f_4(P_{k,\varepsilon}) = -\varepsilon^{-k}$, and the points $P_{k,\varepsilon}$ are thus all S-integral. If $\mathcal{O}_{K,S}^*$ is infinite, it is clear that the points form a dense set in the plane for the Zariski topology.


3. p-adic Numbers


The technique of going from discrete to continuous, by embedding $\mathbb{Z}$ or $\mathbb{Q}$ into $\mathbb{R}$, is classical. There exist completions other than $\mathbb{R}$. In fact, there exists exactly one, up to isomorphism, for each prime number $p$. These completions, far from being exotic, are actually richer in arithmetic and topological content. We will briefly describe them here, and we recommend the following texts for further study: [2] and [8].


3.1. Definition. A p-adic integer is an equivalence class⁴ of sequences $x := \{x_0, x_1, \dots, x_n, \dots\}$ of integers such that $\forall n$, $x_n \equiv x_{n+1} \bmod p^n$. Two sequences are equivalent if $x_n \equiv x'_n \bmod p^n$ for every $n$.


We can also write the sequence of integers in the form $\{a_0,\ a_0 + a_1 p,\ a_0 + a_1 p + a_2 p^2, \dots\}$ and, if we want to, bound it by taking the integers $a_i$ to be in $[0, p-1]$. This suggests the following notation for a p-adic integer, “$x = \sum_{i=0}^{\infty} a_i p^i$”, to which we will soon give a more precise meaning.


We can naturally define the operations of sum and product, which endow the p-adic integers with a ring structure, denoted $\mathbb{Z}_p$. Divisibility is particularly simple.


3.2. Lemma. The ring $\mathbb{Z}_p$ is integral. Furthermore, it satisfies the following properties.

i) $\mathbb{Z}_p^* = \{x = \{x_n\}_{n \geq 0} \mid x_0 \not\equiv 0 \bmod p\}$.
ii) Every non-zero element can be written uniquely $x = p^m u$, where $m \in \mathbb{N}$ and $u \in \mathbb{Z}_p^*$.


Proof. Let $x = \{x_n\}_n \in \mathbb{Z}_p$ with $x_0 \not\equiv 0 \bmod p$. Since $x_n \equiv x_0 \not\equiv 0 \bmod p$, the integers $x_n$ are relatively prime to $p$, and we can choose integers $x'_n$ such that $x'_n$ is the inverse of $x_n$ modulo $p^n$. Therefore, $x'_n \equiv x'_{n+1} \bmod p^n$, and $x' := \{x'_n\}_n$ thus defines a p-adic integer such that $xx' = 1$. If $x = \{x_n\}_n \in \mathbb{Z}_p$ is non-zero, we set $m := \max\{n \mid x_n \equiv 0 \bmod p^n\}$. Then $x_{n+m} = p^m u_n$, where $p$ does not divide $u_n$. The factorization given in the statement follows from this.


We denote by $m = \operatorname{ord}_p(x)$ the maximal power of $p$ which divides $x$.


3.3. Definition. We denote by $\mathbb{Q}_p$ the field of fractions of $\mathbb{Z}_p$; it is called the field of p-adic numbers.


⁴ In more scholarly terms, we could define the p-adic integers as $\mathbb{Z}_p = \varprojlim_n \mathbb{Z}/p^n\mathbb{Z}$.

We point out that every p-adic integer is congruent mod $p^n$ to a natural number, i.e., $\mathbb{Z}_p/p^n\mathbb{Z}_p \cong \mathbb{Z}/p^n\mathbb{Z}$. We will now introduce a topology which will make it clear that $\mathbb{Z}_p$ is a completion of $\mathbb{Z}$.


3.4. Definition. We define the p-adic absolute value by $|x|_p = p^{-\operatorname{ord}_p(x)}$ (and $|0|_p = 0$). We say that the sequence $(u_n)$ tends to $u$ if $\lim_n |u_n - u|_p = 0$.
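
To make the definition concrete on rational numbers, here is a small Python sketch of the valuation and of the p-adic absolute value, using exact fractions. The names `ord_p` and `abs_p` are ours; the sketch only handles elements of $\mathbb{Q}$, not general p-adic numbers.

```python
from fractions import Fraction


def ord_p(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is +infinity")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:  # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:  # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v


def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-ord_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(x, p))


if __name__ == "__main__":
    # The powers of 5 tend to 0 in the 5-adic sense.
    print([abs_p(5 ** n, 5) for n in range(4)])
```

With this absolute value a number is small when it is divisible by a high power of $p$, which is why the sequence $p^n$ tends to 0.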


3.5. Lemma. For the p-adic topology, the following properties hold.

i) The closure of $\mathbb{Z}$ is $\mathbb{Z}_p$, which is compact. The field $\mathbb{Q}_p$ is locally compact.

ii) A sequence $(u_n) \in \mathbb{Z}_p$ converges if and only if $\lim_n (u_{n+1} - u_n) = 0$. Likewise, a series $\sum_n u_n$ converges if and only if $\lim_n u_n = 0$.


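Proof. Let $x = \{x_n\}_n \in \mathbb{Z}_p$. Then $|x - x_n|_p \leq p^{-n}$, and the sequence of integers therefore converges to $x$. Next, $\mathbb{Q}_p = \bigcup_{m \geq 0} p^{-m}\mathbb{Z}_p$. The map $x \mapsto (x \bmod p^n)_{n \geq 1}$ from $\mathbb{Z}_p$ to $\prod_{n \geq 1} \mathbb{Z}/p^n\mathbb{Z}$ is injective and continuous, and its image is closed. The compactness of $\mathbb{Z}_p$ follows from the compactness of the product $\prod_{n \geq 1} \mathbb{Z}/p^n\mathbb{Z}$. The second assertion comes from the ultrametric inequality $|u_M + \dots + u_N|_p \leq \max_{M \leq n \leq N}\{|u_n|_p\}$.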


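3.6. Examples. If $a_n \in \mathbb{Z}$, then the series $\sum_n a_n t^n$ converges for $|t|_p < 1$, in other words, for $t \in p\mathbb{Z}_p$. Thus, the series $\sum_n a_n p^n$ indeed defines a p-adic number (it is the analogue of the decimal expansion of a real number). The “logarithm” series, $\sum_{n \geq 1} (-1)^{n+1} \frac{t^n}{n}$, also converges for $|t|_p < 1$, because $\operatorname{ord}_p(t^n/n) = n \operatorname{ord}_p(t) - \operatorname{ord}_p(n) \geq n \operatorname{ord}_p(t) - \log n/\log p$. The “exponential” series, $\sum_n \frac{t^n}{n!}$, converges if $\operatorname{ord}_p(t) > 1/(p-1)$ or $|t|_p < p^{-\frac{1}{p-1}}$. In fact,
$$\operatorname{ord}_p(n!) = \Big\lfloor \frac{n}{p} \Big\rfloor + \Big\lfloor \frac{n}{p^2} \Big\rfloor + \dots \leq n\Big(\frac{1}{p} + \frac{1}{p^2} + \dots\Big) = \frac{n}{p-1}.$$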
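
The formula for $\operatorname{ord}_p(n!)$ and the bound $n/(p-1)$ are easy to test numerically. The following Python sketch compares the sum of integer parts with a direct computation on $n!$; the function names are ours.

```python
from math import factorial


def ord_p_factorial(n, p):
    """Valuation of n! computed as floor(n/p) + floor(n/p^2) + ..."""
    total, q = 0, p
    while q <= n:
        total += n // q
        q *= p
    return total


def ord_p_direct(m, p):
    """Valuation of a positive integer m by repeated division."""
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        for n in range(1, 60):
            v = ord_p_factorial(n, p)
            assert v == ord_p_direct(factorial(n), p)
            assert (p - 1) * v <= n  # the bound ord_p(n!) <= n/(p-1)
    print(ord_p_factorial(100, 5))
```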


3.7. Theorem. Let $F \in \mathbb{Z}[X_1,\dots,X_n]$. The following statements are equivalent.

i) $\forall m$, $\exists x \in \mathbb{Z}^n$ such that $F(x) \equiv 0 \bmod p^m$.

ii) $\exists x \in (\mathbb{Z}_p)^n$ such that $F(x) = 0$ (in $\mathbb{Z}_p$).


Proof. If $x \in (\mathbb{Z}_p)^n$ satisfies $F(x) = 0$, then $x$ is congruent modulo $p^m$ to an n-tuple of integers. Conversely, if we had $x^{(m)} \in \mathbb{Z}^n$ such that $F(x^{(m)}) \equiv 0 \bmod p^m$, then we could extract a sequence such that $x^{(m+1)} \equiv x^{(m)} \bmod p^m$. We could therefore define, in the p-adics, $x = \lim_m x^{(m)}$ and would then have $F(x) \equiv F(x^{(m)}) \equiv 0 \bmod p^m$ for every $m$, and hence $F(x) = 0$.


We denote by $\nabla F(x) := \Big(\frac{\partial F}{\partial X_1}(x), \dots, \frac{\partial F}{\partial X_n}(x)\Big)$ the gradient of $F$. We will now introduce the p-adic analogue of Newton’s method for finding the zeros of functions or polynomials.


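3.8. Theorem. (Hensel’s lemma) Let $F \in \mathbb{Z}_p[X_1,\dots,X_n]$, $\delta \geq 0$ and $x_0 \in (\mathbb{Z}_p)^n$ such that:

i) $F(x_0) \equiv 0 \bmod p^{2\delta+1}$,
ii) $\nabla F(x_0) \equiv 0 \bmod p^{\delta}$, but $\nabla F(x_0) \not\equiv 0 \bmod p^{\delta+1}$.

Then there exists $x \in (\mathbb{Z}_p)^n$ such that $x \equiv x_0 \bmod p^{\delta+1}$ and $F(x) = 0$. In particular, a smooth point on the hypersurface $F = 0$ modulo $p$ lifts to $\mathbb{Z}_p$.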


Proof. With the notation given in the statement, we can write $F(x_0) = p^{2\delta+1} a$ where $a \in \mathbb{Z}_p$, $\nabla F(x_0) = p^{\delta} b$, $b \in (\mathbb{Z}_p)^n$ and $b \not\equiv 0 \bmod p$. Then we have
$$F(x_0 + p^{\delta+1} u) \equiv F(x_0) + p^{\delta+1} \nabla F(x_0) \cdot u \equiv p^{2\delta+1}(a + b \cdot u) \bmod p^{2\delta+2}.$$
This yields a solution $x_1 = x_0 + p^{\delta+1} u$ such that $F(x_1) \equiv 0 \bmod p^{2\delta+2}$ as soon as $a + b \cdot u \equiv 0 \bmod p$, which is possible because $b \not\equiv 0 \bmod p$. By iterating this procedure, we obtain a sequence $(x_m)$ where $x_{m+1} \equiv x_m \bmod p^{\delta+m+1}$ and $F(x_m) \equiv 0 \bmod p^{2\delta+m+1}$. The sequence therefore converges in $\mathbb{Z}_p$ to $x$, and since $F(x) \equiv 0 \bmod p^m$ for all $m$, we have indeed found $x$ such that $F(x) = 0$.


3.9. Example. The simplest application of this lemma is to a polynomial $P \in \mathbb{Z}[X]$ such that $P(a_0) \equiv 0 \bmod p$ but $P'(a_0) \not\equiv 0 \bmod p$. Hensel’s lemma gives us a way to construct a root $a \in \mathbb{Z}_p$ of the polynomial $P$ such that $a \equiv a_0 \bmod p$.
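
The construction in this example is effective. As a minimal sketch (not the book's own algorithm, and with names of our choosing), the following Python function lifts a simple root modulo $p$ to a root modulo $p^k$, gaining one power of $p$ per step. It assumes Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
def hensel_lift(coeffs, a0, p, k):
    """Lift a simple root a0 of P modulo p to a root modulo p^k.

    coeffs lists the integer coefficients of P, constant term first.
    Requires P(a0) = 0 mod p and P'(a0) != 0 mod p.
    """
    def evaluate(cs, x, m):
        result = 0
        for c in reversed(cs):  # Horner's rule modulo m
            result = (result * x + c) % m
        return result

    derivative = [i * c for i, c in enumerate(coeffs)][1:]
    if evaluate(coeffs, a0, p) != 0 or evaluate(derivative, a0, p) == 0:
        raise ValueError("a0 is not a simple root modulo p")

    inverse = pow(evaluate(derivative, a0, p), -1, p)  # 1 / P'(a0) modulo p
    a, modulus = a0 % p, p
    for _ in range(k - 1):
        modulus *= p
        # Correct a by a multiple of the previous modulus so that P(a) vanishes
        # modulo the new one.
        a = (a - evaluate(coeffs, a, modulus) * inverse) % modulus
    return a


if __name__ == "__main__":
    # A square root of 2 in Z_7, starting from 3^2 = 9 = 2 mod 7.
    a = hensel_lift([-2, 0, 1], 3, 7, 6)
    print(a, (a * a - 2) % 7 ** 6)
```

Each pass fixes one more digit of the expansion $a = a_0 + a_1 p + a_2 p^2 + \dots$, so the successive values form a sequence of the kind used in the definition of a p-adic integer.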


3.10. Remark. This theorem, together with the result of Lang-Weil (Theorem 6-1.6), allows us to find an algorithm for deciding if an equation is solvable mod $N$ for every integer $N$ or, in the same fashion, if it is solvable in $\mathbb{Z}_p$ or $\mathbb{Q}_p$ for all $p$. To see this, take the case of a polynomial $F \in \mathbb{Z}[X_1,\dots,X_n]$, which we will assume to be irreducible. The Lang-Weil estimates show that the equation modulo $p$ possesses roughly $p^{n-1}$ solutions, while the number of singular solutions is $O(p^{n-2})$. For large enough $p$, there will be a nonsingular solution modulo $p$ and hence, by Hensel’s lemma, a lifting to $\mathbb{Z}_p$ of this solution. For a given $p$, the previous theorem essentially provides an algorithm which tells us that either there exists a solution in $\mathbb{Z}_p$ or there exists $\delta$ such that the equation is not solvable modulo $p^{\delta}$.


3.11. Remark. Hensel’s lemma allows us to specify the structure of the group of p-adic units $U := \mathbb{Z}_p^*$. For this, we will introduce the subgroups $U_m = \{x \in U \mid x \equiv 1 \bmod p^m\}$.


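3.12. Lemma. Let $p$ be odd. There exists a unique subgroup $\mu_{p-1} \subset \mathbb{Z}_p^*$, isomorphic to $U/U_1 \cong \mathbb{F}_p^*$. If $m \geq 1$, there is an isomorphism $U_m/U_{m+1} \cong \mathbb{Z}/p\mathbb{Z}$. In particular, as topological groups,
$$\mathbb{Z}_p^* \cong \mathbb{Z}/(p-1)\mathbb{Z} \times \mathbb{Z}_p.$$
If $p = 2$, then $\mathbb{Z}_2^* = \{\pm 1\} \times U_2 \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}_2$.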


Proof. The proof immediately follows from Hensel’s lemma: the solutions of $x^{p-1} \equiv 1 \bmod p$ can be lifted to $\mathbb{Z}_p$, and the map $x \mapsto 1 + px$ induces a bijection from $\mathbb{Z}_p$ to $U_1$, then an isomorphism from $U_1/U_2$ to $\mathbb{Z}/p\mathbb{Z}$, which proves the second part of the statement when $p \neq 2$. The case $p = 2$ can be treated similarly. We could also notice that the “logarithm” map $U_1 \to p\mathbb{Z}_p$, given by $1 + px \mapsto \sum_{n \geq 1} (-1)^{n+1} p^n x^n/n$, and the “exponential” map $p\mathbb{Z}_p \to U_1$ provide the desired isomorphism.


Using Hensel’s lemma, we can also completely study the p-adic squares. 


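3.13. Lemma. The squares in $\mathbb{Q}_p^*$ can be described as follows.

i) If $p$ is odd, any unit $u \in \mathbb{Z}_p^*$, where $u \equiv 1 \bmod p$, is a square. Furthermore, $(\mathbb{Q}_p^* : \mathbb{Q}_p^{*2}) = 4$, and representatives of the classes are given by $1, \varepsilon, p, \varepsilon p$ where $\varepsilon$ is not a square modulo $p$, i.e., $\big(\frac{\varepsilon}{p}\big) = -1$.
ii) If $p = 2$, a unit $u \in \mathbb{Z}_2^*$ with $u \equiv 1 \bmod 8$ is a square. Also, $(\mathbb{Q}_2^* : \mathbb{Q}_2^{*2}) = 8$, and the representatives are $\{\pm 1, \pm 2, \pm 3, \pm 6\}$.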
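
For a rational number the resulting criterion is easy to test by machine. Writing $x = p^m u$ with $u$ a p-adic unit, $x$ is a square in $\mathbb{Q}_p$ exactly when $m$ is even and $u$ is a square modulo $p$ (for odd $p$) or $u \equiv 1 \bmod 8$ (for $p = 2$). The Python sketch below implements this test; the function name `is_square_in_Qp` is ours, and the input is assumed to be a non-zero rational.

```python
from fractions import Fraction


def is_square_in_Qp(x, p):
    """Decide whether the non-zero rational x is a square in Q_p (p prime)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("x must be non-zero")
    num, den, m = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        m += 1
    while den % p == 0:
        den //= p
        m -= 1
    if m % 2 != 0:
        return False
    # u = num/den differs from the integer num*den by the square den^2.
    t = num * den
    if p == 2:
        return t % 8 == 1
    return pow(t % p, (p - 1) // 2, p) == 1  # Euler's criterion


if __name__ == "__main__":
    # Check that 1, 3, 7, 21 lie in four distinct square classes of Q_7:
    # no quotient of two of them is a square.
    reps = [Fraction(r) for r in (1, 3, 7, 21)]
    print(all(not is_square_in_Qp(a / b, 7)
              for a in reps for b in reps if a != b))
```

The small check in the main block uses $\varepsilon = 3$, which is not a square modulo 7.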


Proof. Consider the equation $F(x) = x^2 - u = 0$. For $p$ odd, if $u \equiv 1 \bmod p$, then $F(1) \equiv 0 \bmod p$ and $F'(1) = 2 \not\equiv 0 \bmod p$. More generally, if $u \equiv v^2 \bmod p$, then $F(v) \equiv 0 \bmod p$ and $F'(v) = 2v \not\equiv 0 \bmod p$. Hensel’s lemma therefore gives an $x \in \mathbb{Z}_p$ such that $x^2 = u$. We can thus see that a p-adic number $y = p^m u$ (where $m \in \mathbb{Z}$ and $u \in \mathbb{Z}_p^*$) is a square if and only if $m$ is even and $u$ is a square modulo $p$. If now $p = 2$, as soon as we have $x_0 \not\equiv 0 \bmod 2$, where $F(x_0) \equiv 0 \bmod 2^3$, we can apply Hensel’s lemma (with “$\delta$” equal to 1) and deduce that $u$ is a square. The only remaining point to check is the congruence $x^2 \equiv u \bmod 8$ for odd $u$.

In order to study quadratic forms over $\mathbb{Q}_p$, we can, as with a field of characteristic $\neq 2$, reduce to the case of diagonal forms, $a_1 x_1^2 + \dots + a_n x_n^2$ with $a_i \in \mathbb{Z}_p$, and then, by factoring $p^2 x^2 = (px)^2$, reduce to the case where $\operatorname{ord}_p(a_i) = 0$ or $1$. To summarize, it suffices to study forms of the type
$$\begin{aligned}
Q(x_1,\dots,x_n) &= Q_1(x_1,\dots,x_s) + pQ_2(x_{s+1},\dots,x_n) \\
&= a_1 x_1^2 + \dots + a_s x_s^2 + p(a_{s+1} x_{s+1}^2 + \dots + a_n x_n^2)
\end{aligned}$$
where $a_i \in \mathbb{Z}_p^*$. We can easily see that there exists $x \neq 0$ such that $Q(x) = 0$ if and only if there exists a nontrivial zero of $Q_1$ or $Q_2$. Hensel’s lemma yields the following result.


3.14. Lemma. Let $p$ be an odd prime. The equation $a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 = 0$, where $a_i \in \mathbb{Z}_p^*$, has a nontrivial zero in $\mathbb{Q}_p$.


Proof. If $p \neq 2$, we can find a nonsingular point mod $p$ on the conic, which lifts to a p-adic point.


3.15. Corollary. Any quadratic form in $n \geq 5$ variables has a nontrivial zero in $\mathbb{Q}_p$.


Proof. If $p$ is odd, we can write $Q(x) = a_1 x_1^2 + \dots + a_s x_s^2 + p(a_{s+1} x_{s+1}^2 + \dots + a_n x_n^2)$ where $a_i \not\equiv 0 \bmod p$. Since either $s \geq 3$ or $n - s \geq 3$, the result follows from the previous lemma. If $p = 2$, the proof can be deduced from the lemmas and remarks below.


3.16. Remark. Quadratic forms are thus a little bit more complicated over $\mathbb{Q}_2$ as shown in the following example (essentially treated in the study of sums of squares in Chap. 3). The quadratic form $Q(x, y, z, t) := x^2 + y^2 + z^2 - 7t^2$ does not have any nontrivial zero in $(\mathbb{Q}_2)^4$. We can nevertheless prove, for example, the following lemma, whose proof is also based on Hensel’s lemma and left to the reader.


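3.17. Lemma. a) Let $Q(x) = a_1 x_1^2 + \dots + a_s x_s^2$ with $a_i \in \mathbb{Z}_2^*$. A primitive solution to $Q(x) \equiv 0 \bmod 8$ can be lifted to $\mathbb{Z}_2$.

b) Let $Q(x) = a_1 x_1^2 + \dots + a_s x_s^2 + 2a_{s+1} x_{s+1}^2 + \dots + 2a_n x_n^2$ with $a_i \in \mathbb{Z}_2^*$. A primitive solution to $Q(x) \equiv 0 \bmod 16$ can be lifted to $\mathbb{Z}_2$.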


The following theorem is an archetypal local-global arithmetic theorem.

3.18. Theorem. (Hasse-Minkowski) If $Q(x) = \sum_{i,j} a_{i,j} x_i x_j$ is a quadratic form with rational coefficients, then it has a nontrivial rational zero if and only if it has a nontrivial zero in $\mathbb{R}$ and in every $\mathbb{Q}_p$.

For the proof, see [2] or [8]. In reference to this theorem, a class of varieties defined over $\mathbb{Q}$ is said to “satisfy the Hasse principle” if the existence of a real point and a p-adic point for every $p$ implies the existence of a rational point. We can also reformulate the Hasse-Minkowski theorem as saying that the quadrics or hypersurfaces of degree 2 satisfy the Hasse principle.


3.19. Corollary. The quadratic form $Q(x, y, z, t) = x^2 + y^2 + z^2 - mt^2$ has a nontrivial rational zero if and only if $m$ is not of the form $4^a(8b + 7)$.


Proof. We apply the preceding theorem by observing that the quadratic form $Q(x, y, z, t)$ always has, by Hensel’s lemma, a nontrivial zero in $\mathbb{Q}_p$ for odd $p$. It also has a nontrivial real zero by assumption. Finally, it has a nontrivial zero in $\mathbb{Q}_2$ if and only if the condition of the corollary is satisfied.
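
For positive integers $m$, one direction of this corollary can be observed by brute force: if $m = x^2 + y^2 + z^2$ with integers $x, y, z$, then $(x, y, z, 1)$ is a nontrivial rational zero. The Python sketch below searches for such integer solutions only (an assumption made to keep the search finite) and compares the outcome with the condition $m \neq 4^a(8b+7)$; that the two agree is the classical three-square theorem of Legendre. The function names are ours and `math.isqrt` requires Python 3.8 or later.

```python
from math import isqrt


def is_excluded(m):
    """True if the positive integer m has the form 4^a (8b + 7)."""
    while m % 4 == 0:
        m //= 4
    return m % 8 == 7


def three_square_zero(m):
    """Return integers (x, y, z) with x^2 + y^2 + z^2 = m, or None."""
    for x in range(isqrt(m) + 1):
        for y in range(x, isqrt(m - x * x) + 1):
            rest = m - x * x - y * y
            z = isqrt(rest)
            if z * z == rest:
                return (x, y, z)
    return None


if __name__ == "__main__":
    print([m for m in range(1, 64) if three_square_zero(m) is None])
```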


3.20. Corollary. A quadratic form in five variables Q(x, y,z,t,u) has 
a nontrivial rational zero if and only if it is neither positive-definite nor 
negative-definite. 


The Hasse-Minkowski theorem cannot be generalized to hypersurfaces or varieties of higher degree. Exercises 1-6.23 and 3-6.24 give examples of equations having (nontrivial) solutions in $\mathbb{R}$ and in each $\mathbb{Q}_p$ but none in $\mathbb{Q}$. If we examine the case of cubic hypersurfaces, for which there is always a real zero, we know however how to prove the following result.


3.21. Proposition. (Lewis [52]) A cubic hypersurface $F(x_1,\dots,x_n) = 0$ has a nontrivial zero in $\mathbb{Q}_p$ whenever $n \geq 10$. There exist cubic hypersurfaces with no nontrivial zeros in $\mathbb{Q}_p$ when $n = 9$.


3.22. Theorem. Let $F(x_1,\dots,x_n) = 0$ be a smooth cubic form with coefficients in $\mathbb{Z}$.
1) (Heath-Brown [36]) If $n \geq 10$, the form $F$ has a nontrivial rational zero.
2) (Hooley [39]) If $n = 9$ and if the form $F$ has a nontrivial p-adic zero for every $p$, then it has a nontrivial rational zero.


It is actually fairly easy to construct cubic forms in 9 variables without any nontrivial zeros in $\mathbb{Q}_p$. We will start off with a cubic form in three variables over $\mathbb{F}_p$, having only the trivial zero in $\mathbb{F}_p$. It suffices to take, for example, $G_0(x, y, z) = N(x + y\omega + z\omega^2)$ where $1, \omega, \omega^2$ is a basis for $\mathbb{F}_{p^3}$ over $\mathbb{F}_p$ and $N = N_{\mathbb{F}_{p^3}/\mathbb{F}_p}$. We then take a form $G(x, y, z)$ with coefficients in $\mathbb{Z}_p$ such that $G \bmod p$ coincides with $G_0$, and we set
$$F(x_1, \dots, x_9) = G(x_1, x_2, x_3) + pG(x_4, x_5, x_6) + p^2 G(x_7, x_8, x_9).$$
If $F$ had a nontrivial zero $x \in (\mathbb{Z}_p)^9$, there would exist one such that $x \not\equiv 0 \bmod p$. But by reducing modulo $p$, we would have $G(x_1, x_2, x_3) \equiv 0 \bmod p$, hence $p$ would divide $x_1$, $x_2$ and $x_3$. By reducing modulo $p^2$, we can infer that $pG(x_4, x_5, x_6) \equiv 0 \bmod p^2$, hence $p$ would divide $x_4$, $x_5$ and $x_6$. Finally, by reducing modulo $p^3$, we would have $p^2 G(x_7, x_8, x_9) \equiv 0 \bmod p^3$, hence $p$ would divide $x_7$, $x_8$ and $x_9$, which yields a contradiction.
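
The first step of this construction can be checked by brute force for small primes. The sketch below takes, as an illustrative choice of ours, $\omega$ a root of $t^3 - t - 1$, which is irreducible modulo 2 and modulo 3, and computes the norm of $x + y\omega + z\omega^2$ as the determinant of the multiplication matrix in the basis $1, \omega, \omega^2$. Modulo 5 the same polynomial has the root 2, so the form is no longer a norm form of a field and acquires nontrivial zeros.

```python
from itertools import product


def norm_form(x, y, z):
    """Determinant of multiplication by x + y*w + z*w^2, where w^3 = w + 1.

    The rows are the coordinates of the products with 1, w and w^2
    in the basis (1, w, w^2).
    """
    m = [[x, y, z],
         [z, x + z, y],
         [y, y + z, x + z]]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def has_only_trivial_zero(p):
    """True if the cubic form has no zero modulo p besides (0, 0, 0)."""
    return all(norm_form(*v) % p != 0
               for v in product(range(p), repeat=3) if any(v))


if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, has_only_trivial_zero(p))
```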


The following statement is part of the folklore. 


3.23. Conjecture.
1) A cubic form in 10 or more variables represents zero over $\mathbb{Q}$.
2) A cubic form in 9 or more variables represents zero over $\mathbb{Q}$ if and only if it represents zero over every $\mathbb{Q}_p$.


We know, by Davenport, that a cubic form in 16 or more variables represents zero over $\mathbb{Q}$. An optimistic version of the conjecture would be to replace 9 by 5 in 2). However, there do exist forms in 4 variables which contradict the Hasse principle. One of the first counterexamples is due to Cassels and Guy:
$$5x^3 + 12y^3 + 9z^3 + 10t^3 = 0.$$
Here is an even simpler one, due to Birch and Swinnerton-Dyer (see [13]):
$$-5x^3 + 22y^3 + 2z^3 + 4w^3 - 6zw(x + 4y) = 0.$$
The proof of the nonexistence of (nontrivial) rational solutions is given in Exercise 3-6.24, where this equation is written (with $\alpha := \sqrt[3]{2}$) in the form:
$$N^{\mathbb{Q}(\alpha)}_{\mathbb{Q}}(x + 4y + z\alpha + w\alpha^2) - 6(x + y)(x^2 + xy + 7y^2) = 0.$$
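
That the two ways of writing the Birch and Swinnerton-Dyer cubic agree can be confirmed mechanically. The Python sketch below uses the standard formula $N(a + b\alpha + c\alpha^2) = a^3 + 2b^3 + 4c^3 - 6abc$ for $\alpha = \sqrt[3]{2}$ and compares both sides on a grid of integer points; since both sides are cubic polynomials, agreement on this grid is more than enough to identify them. The function names are ours.

```python
from itertools import product


def norm_cbrt2(a, b, c):
    """Norm from Q(2^(1/3)) to Q of a + b*alpha + c*alpha^2."""
    return a ** 3 + 2 * b ** 3 + 4 * c ** 3 - 6 * a * b * c


def cubic(x, y, z, w):
    """The form -5x^3 + 22y^3 + 2z^3 + 4w^3 - 6zw(x + 4y)."""
    return (-5 * x ** 3 + 22 * y ** 3 + 2 * z ** 3 + 4 * w ** 3
            - 6 * z * w * (x + 4 * y))


def cubic_via_norm(x, y, z, w):
    """The same form written with the norm."""
    return norm_cbrt2(x + 4 * y, z, w) - 6 * (x + y) * (x * x + x * y + 7 * y * y)


if __name__ == "__main__":
    grid = product(range(-3, 4), repeat=4)
    print(all(cubic(*v) == cubic_via_norm(*v) for v in grid))
```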


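3.24. Remark. No difficulties arise when generalizing what we have presented to $\mathfrak{p}$-adic completions in the case of a number field $K$ equipped with a non-zero prime ideal $\mathfrak{p}$. We define the $\mathfrak{p}$-adic absolute value by $|x|_{\mathfrak{p}} = N\mathfrak{p}^{-\operatorname{ord}_{\mathfrak{p}}(x)}$ and the ring of $\mathfrak{p}$-adic integers (resp. the field of $\mathfrak{p}$-adic numbers) by:
$$\mathcal{O}_{\mathfrak{p}} := \varprojlim_n \big(\mathcal{O}_K/\mathfrak{p}^n\big), \quad (\text{resp. } K_{\mathfrak{p}} = \operatorname{Frac}(\mathcal{O}_{\mathfrak{p}})).$$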


3.25. Remarks. Adeles and ideles. We can regroup the p-adic completions into a global object which proves to be very interesting. We denote by $M_K$, as in Chap. 5, the set of places of $K$, i.e., the union of the set $M_K^{\infty}$ of Archimedean places (real embeddings and pairs of complex embeddings) and the set of (non-zero) prime ideals of $\mathcal{O}_K$. For every finite set $S$ of places of $K$ which contains the Archimedean places, we set
$$\mathbb{A}_K^S = \prod_{v \in S} K_v \times \prod_{v \notin S} \mathcal{O}_v.$$


These sets are endowed with the product topology, and their union therefore 
also inherits a topology. 


3.26. Definition. The ring of adeles of $K$ is the ring
$$\mathbb{A}_K = \bigcup_S \mathbb{A}_K^S.$$
We can also define the adeles as the set of sequences $x = (x_v)_{v \in M_K}$ such that $x_v \in K_v$ and for every $v \in M_K$, except for a finite number of them (dependent on $x$), we have $x_v \in \mathcal{O}_v$. The field $K$ is embedded diagonally in the adeles, and one important property is that $\mathbb{A}_K/K$ is compact. Every local field $K_v$ is also embedded in the adeles by $x_v \mapsto (0, \dots, 0, x_v, 0, \dots)$.


3.27. Definition. The group of ideles of $K$ is the group
$$J_K = \bigcup_S J_K^S = \bigcup_S (\mathbb{A}_K^S)^* = \mathbb{A}_K^*.$$
Ideles are naturally endowed with the topology inherited from the product topology on $J_K^S = \prod_{v \in S} K_v^* \times \prod_{v \notin S} \mathcal{O}_v^*$. We should however point out that this topology is different from the topology induced by the inclusion $J_K \subset \mathbb{A}_K$. We can also define the ideles as the set of sequences $x = (x_v)_{v \in M_K}$ such that $x_v \in K_v^*$ and for every $v \in M_K$, except for a finite number of them (dependent on $x$), we have $x_v \in \mathcal{O}_v^*$. Every finite extension, $L/K$, has a natural “norm” map:
$$N^L_K : J_L \to J_K.$$
The multiplicative group of the field $K^*$ is embedded diagonally in the ideles. Every multiplicative group $K_v^*$ is also embedded in the ideles by $x_v \mapsto (1, \dots, 1, x_v, 1, \dots)$. We have a numerical norm on the ideles defined for $x = (x_v)_{v \in M_K}$ by
$$\|x\| = \prod_{v \in M_K} |x_v|_v.$$
The kernel of the norm contains $K^*$ (cf. Theorem 5-2.1.5) and is denoted $J_K^0$. An important property of $J_K^0/K^*$ is that it is compact.


By considering the map from $J_K$ to the fractional ideals of $K$ which associates to $(x_v)_{v \in M_K}$ the ideal $\prod_{\mathfrak{p}} \mathfrak{p}^{\operatorname{ord}_{\mathfrak{p}}(x_{\mathfrak{p}})}$, it can be shown that we have an isomorphism:
$$J_K / J_K^{M_K^{\infty}} K^* \cong \mathrm{Cl}_K. \qquad (6.15)$$
In fact, the compactness of $J_K^0/K^*$ is equivalent to the combination of the finiteness of the class group and Dirichlet’s unit theorem.


4. Transcendental Numbers and Diophantine 
Approximation 


A proof that a number, such as $\pi$ or $e$, is transcendental resembles, at least formally, a proof in Diophantine approximation. This classic theme is expanded on in Baker’s book [1] and in [12]. We offer you here a first taste of this theory.


It can be said that the theory of Diophantine approximation and transcendental numbers starts with Liouville’s result, which says that if $\alpha$ is an algebraic number of degree $d := [\mathbb{Q}(\alpha) : \mathbb{Q}] > 1$, then there exists a constant $C = C(\alpha) > 0$ such that for every rational number $p/q$, we have
$$\Big|\alpha - \frac{p}{q}\Big| \geq \frac{C}{q^d}.$$
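
Liouville's inequality can be watched in action for $\alpha = \sqrt{2}$, where $d = 2$. Since $2q^2 - p^2$ is a non-zero integer, we have $|\sqrt{2} - p/q| = |2q^2 - p^2| / (q(q\sqrt{2} + p)) \geq 1/(q(q\sqrt{2} + p))$, and for the integer $p$ nearest to $q\sqrt{2}$ this gives $q^2 |\sqrt{2} - p/q| \geq 1/(2\sqrt{2} + 1/2) > 0.3$. The Python sketch below, with a function name of our choosing, computes this scaled distance in floating point for a range of denominators.

```python
from math import sqrt


def scaled_distance(q):
    """q^2 * |sqrt(2) - p/q| for the integer p closest to q*sqrt(2).

    The difference is computed as |2q^2 - p^2| / (q (q sqrt(2) + p)),
    which avoids subtracting two nearly equal floating-point numbers.
    """
    p = round(q * sqrt(2))
    return q * abs(2 * q * q - p * p) / (q * sqrt(2) + p)


if __name__ == "__main__":
    values = [scaled_distance(q) for q in range(1, 2001)]
    print(min(values))  # stays above the constant C = 0.3
```

The denominators for which the value is smallest are those of the continued fraction convergents of $\sqrt{2}$ (2, 5, 12, 29, ...), in line with Dirichlet's theorem, which shows that the exponent 2 cannot be lowered.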


We have seen in the proof of Siegel’s theorem (Theorem 5-4.1) that it was 
essential to improve on the exponent d. The first result in this direction is 
due to Thue (1909). 


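4.1. Theorem. (Thue) Let $\alpha$ be an algebraic number of degree $d = [\mathbb{Q}(\alpha) : \mathbb{Q}] > 1$ and $\varepsilon > 0$. There exists a constant $C = C(\alpha, \varepsilon)$ such that for every rational number $p/q$, we have
$$\Big|\alpha - \frac{p}{q}\Big| \geq \frac{C}{q^{\frac{d}{2} + 1 + \varepsilon}}. \qquad (6.16)$$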


The proof is sketched further down. Successive improvements are due to Siegel (1921), who showed that the exponent $\frac{d}{2} + 1 + \varepsilon$ can be replaced by $2\sqrt{d} + \varepsilon$ (which is an improvement on Thue’s theorem when $d \geq 12$); Gel’fond and Dyson (1947), who proved that the exponent can be replaced by $\sqrt{2d} + \varepsilon$ and finally Roth (1954), who proved a result which is essentially optimal considering Dirichlet’s theorem (Corollary 3-3.8). The Fields Medal was awarded to Roth in 1958 for this proof.


4.2. Theorem. (Roth) Let $\alpha \notin \mathbb{Q}$ be an algebraic number and $\varepsilon > 0$. There exists a constant $C = C(\alpha, \varepsilon) > 0$ such that for every rational number $p/q$ we have
$$\Big|\alpha - \frac{p}{q}\Big| \geq \frac{C}{q^{2+\varepsilon}}. \qquad (6.17)$$


One of the most important theorems which is purely about transcendence is due to Baker (1966). It concerns linear forms of logarithms. The Fields Medal was awarded to Baker in 1970 for this result and its numerous applications. We denote by $\log \alpha$ a complex number $\beta$ such that $\exp(\beta) = \alpha$; for example, we can write $2i\pi = \log 1$.


4.3. Theorem. (Baker) Let $\alpha_1, \dots, \alpha_n$ be non-zero algebraic numbers. If $\log \alpha_1, \dots, \log \alpha_n$ are $\mathbb{Q}$-linearly independent, then the numbers $1, \log \alpha_1, \dots, \log \alpha_n$ are $\overline{\mathbb{Q}}$-linearly independent.


4.4. Remark. We can recover as a corollary to this theorem a certain number of classical results on transcendence.

i) The number $e$ is transcendental (Hermite, 1873); this is true because if it were algebraic then the number $1 = \log e$ would be transcendental.
ii) The number $\pi$ is transcendental (Lindemann, 1882). This is true because the number $2i\pi = \log 1$ is transcendental.
iii) If $\alpha \in \overline{\mathbb{Q}} \setminus \{0, 1\}$ and $\beta \in \overline{\mathbb{Q}} \setminus \mathbb{Q}$, the number $\gamma := \alpha^{\beta}$ is transcendental (Gel’fond and Schneider, 1934). This is because if it were algebraic, since $\log \gamma - \beta \log \alpha = 0$, we could deduce that $\beta \in \mathbb{Q}$.


Further down, we will give (Theorem 6-4.15) a quantitative version of 
Baker’s assertion, which has become a fundamental tool in the study of 
Diophantine equations. 


Schematically, the proofs of all of these theorems follow the same pattern 
as the proof of Liouville’s result (see [12] for a more complete picture). 


• 1st step. We start by constructing a polynomial $F$ with integer coefficients
which vanishes at designated points $\alpha$, or at least takes a very small value
at them. In Liouville's proof, a minimal polynomial of the algebraic
number $\alpha$ is chosen, and in the general case, an elementary lemma
formalized by Siegel is used (see Lemmas 6-4.5 and 6-4.6).

• 2nd step. We use the fact that we can control the size of the coefficients
of the polynomial $F$ to conclude that $F$ is still very small at an algebraic
point $\beta$ close to $\alpha$. To do this, we make use of a lemma by Schwarz
(Lemma 6-4.10) or also Taylor's formula (Lemma 6-4.7).

• 3rd step. We prove that the polynomial $F$ must vanish at $\beta$ or that
the height of $\beta$ must be large. To do this, we use Liouville's inequality,
which says, in its most rudimentary form, that a non-zero integer has
absolute value $\geq 1$ (see Corollary 6-4.12).

• 4th step. We prove a “zeros estimate” adapted to the situation, which
allows us to control the location of the zeros of $F$. If $F \in \mathbb{Z}[X]$, we can
in general settle for counting the zeros of $F$, but if $F \in \mathbb{Z}[X_1, \dots, X_n]$,
where $n \geq 2$, this step could prove to be very difficult.


We will now present some lemmas which are useful for putting this method
into action.


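4.5. Lemma. (Siegel's Lemma I) Let $N > M$, and let the following be a
system of linear equations with integer coefficients $a_{ij}$ not all equal to zero:

$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1N}x_N &= 0 \\
&\;\;\vdots \\
a_{M1}x_1 + \cdots + a_{MN}x_N &= 0.
\end{aligned}$$

Then there exists a nontrivial solution $(x_1, \dots, x_N) \in \mathbb{Z}^N$ which satisfies

$$\max_i |x_i| \leq \left(N \max_{i,j} |a_{ij}|\right)^{\frac{M}{N-M}}.$$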


Proof. The proof is another application of the pigeonhole principle. Set
$a_{ij}^{+} := \max(0, a_{ij})$, $a_{ij}^{-} := \max(0, -a_{ij})$ and $L_i := \sum_j |a_{ij}|$, and observe that
we can assume that $L_i \geq 1$ (if not, the corresponding equation is trivial
and can be omitted). Choose $X := \left\lfloor (L_1 \cdots L_M)^{\frac{1}{N-M}} \right\rfloor$, and consider the
map from $[0, X]^N$ to $\mathbb{Z}^M$ given by

$$L(x_1, \dots, x_N) = (a_{11}x_1 + \cdots + a_{1N}x_N, \dots, a_{M1}x_1 + \cdots + a_{MN}x_N).$$

We clearly have:

$$-X \sum_j a_{ij}^{-} \leq a_{i1}x_1 + \cdots + a_{iN}x_N \leq X \sum_j a_{ij}^{+}.$$

The number of values taken by $L(x)$ is thus at most

$$\prod_{i=1}^{M} \left( X \sum_j a_{ij}^{+} + X \sum_j a_{ij}^{-} + 1 \right) = \prod_{i=1}^{M} (X L_i + 1).$$

By the initial choice of $X$, we have the inequality $(X+1)^{N-M} > L_1 \cdots L_M$,
and hence

$$(X+1)^{N} > \prod_{i=1}^{M} (X L_i + L_i) \geq \prod_{i=1}^{M} (X L_i + 1).$$

Consequently, there exist two distinct elements $x', x'' \in [0, X]^N \cap \mathbb{Z}^N$ such
that $L(x') = L(x'')$. The element $x := x' - x''$ is thus in $\mathbb{Z}^N \setminus \{0\}$ and
satisfies $L(x) = 0$ and

$$0 < \max_i |x_i| \leq (L_1 \cdots L_M)^{\frac{1}{N-M}} \leq \left( N \max_{i,j} |a_{ij}| \right)^{\frac{M}{N-M}}.$$
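As an illustration of Siegel's lemma, the following sketch searches by brute force for a smallest non-zero integer solution of a small system and compares it with the bound $(N\max|a_{ij}|)^{M/(N-M)}$. The function names `siegel_bound` and `smallest_solution` are hypothetical, and exhaustive search is only meant for very small systems; it is not the pigeonhole argument of the proof.

```python
from itertools import product
from math import floor

def siegel_bound(rows):
    """Bound (N * max|a_ij|)^(M/(N-M)) for M equations in N > M unknowns."""
    m, n = len(rows), len(rows[0])
    amax = max(abs(a) for row in rows for a in row)
    return (n * amax) ** (m / (n - m))

def smallest_solution(rows):
    """Non-zero integer solution of smallest sup-norm within Siegel's bound."""
    n = len(rows[0])
    limit = floor(siegel_bound(rows) + 1e-9)  # small slack against rounding
    for h in range(1, limit + 1):
        for x in product(range(-h, h + 1), repeat=n):
            if max(map(abs, x)) == h and all(
                sum(a * xi for a, xi in zip(row, x)) == 0 for row in rows
            ):
                return x
    return None

system = [[3, -5, 7]]  # one equation (M = 1) in three unknowns (N = 3)
solution = smallest_solution(system)
print(solution, siegel_bound(system))
```

For this system the bound is $\sqrt{21} \approx 4.58$, and the search returns a solution of sup-norm 2, comfortably inside the guaranteed range.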


By observing that $\log\max|x_i|$ is the (logarithmic) height of a solution, we
can remember the upper bound (neglecting a term in $\log N$) as:

$$\text{height of a solution} \leq (\text{height of the equations}) \times \frac{\text{number of equations}}{\text{dimension of the solutions}}.$$


We will state, without proof (see for example [38]), the version where the
coefficients $a_{ij}$ are in a number field. The idea is the same: an equation
with coefficients in $K$ provides $d := [K : \mathbb{Q}]$ equations with coefficients
in $\mathbb{Q}$.


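4.6. Lemma. (Siegel's Lemma II) Let $K$ be a number field of degree
$d := [K : \mathbb{Q}]$ and $a_{ij} \in K$ not all zero. Suppose that $dM < N$, and let
$A := H(\dots, a_{ij}, \dots)$. Then the linear system

$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1N}x_N &= 0 \\
&\;\;\vdots \\
a_{M1}x_1 + \cdots + a_{MN}x_N &= 0
\end{aligned}$$

has a nontrivial solution $(x_1, \dots, x_N) \in \mathbb{Z}^N$ such that

$$\max_i |x_i| \leq (NA)^{\frac{dM}{N-dM}}.$$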


The following lemma is a version of the following principle: if a polynomial 
vanishes with a large order at a point, then the polynomial takes small 
values in a neighborhood of this point if the coefficients of the polynomial 
are not too large. 


4.7. Lemma. (Application of Taylor's formula) Let $P \in \mathbb{C}[X_1, \dots, X_m]$
be a polynomial of degree $\leq D$ such that the absolute value of the coefficients
is $\leq \|P\|$. Suppose that $P$ vanishes with order $T$ at $\alpha = (\alpha_1, \dots, \alpha_m)$. If
$\beta = (\beta_1, \dots, \beta_m)$ satisfies $|\alpha_i - \beta_i| \leq \varepsilon$, then for $|i| = i_1 + \cdots + i_m < T$,

$$\left| \frac{1}{i_1! \cdots i_m!} \frac{\partial^{|i|} P}{\partial X_1^{i_1} \cdots \partial X_m^{i_m}}(\beta) \right| \leq (8^m A)^D \varepsilon^{T-|i|} \|P\|,$$

where $A := \max\{1, |\alpha_1|, \dots, |\alpha_m|\}$.
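Proof. We write Taylor's formula at $\alpha$ for $Q := \frac{1}{i_1! \cdots i_m!} \frac{\partial^{|i|} P}{\partial X_1^{i_1} \cdots \partial X_m^{i_m}}$ as:

$$Q(\beta) = \sum_{|j| \geq T - |i|} \frac{1}{j_1! \cdots j_m!} \frac{\partial^{|j|} Q}{\partial X_1^{j_1} \cdots \partial X_m^{j_m}}(\alpha) \prod_{k=1}^{m} (\beta_k - \alpha_k)^{j_k}.$$

The product of the $(\alpha_k - \beta_k)$ has absolute value $\leq \varepsilon^{T-|i|}$. By expanding
the sum as factors, we have an upper bound given by a sum of the type
(in multi-index notation)

$$\sum \frac{\ell!}{i!\, j!\, (\ell - i - j)!} \|P\| A^{D} \leq 8^{mD} A^{D} \|P\|,$$

as in the statement of the lemma.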


4.8. Definition. Let $a \in D(0,1)$. The Blaschke factor associated to $a$ is
the function

$$B_a(z) = \frac{z - a}{1 - \bar{a} z}. \qquad (6.18)$$


Some very simple properties of this factor are summarized in the following 
lemma, whose proof is left to the reader. 


4.9. Lemma. The Blaschke factor has the following properties.

i) The function $B_a(z)$ is holomorphic on the closed disk $\bar{D}(0,1)$ and has
a unique simple zero at $z = a$.

ii) If $|z| = 1$, then $|B_a(z)| = 1$. In particular, $\|B_a\|_1 := \sup_{|z| \leq 1} |B_a(z)| = 1$.

iii) For $z \in D(0,1)$, we have the upper bound

$$|B_a(z)| \leq \frac{|z| + |a|}{1 - |a||z|}.$$
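The three properties can be checked numerically. The sketch below is only a floating-point sanity check for one sample value of $a$ (an arbitrary choice of this illustration), with the upper bound taken in the form $(|z|+|a|)/(1-|a||z|)$; the helper name `blaschke` is hypothetical.

```python
import cmath
import math

def blaschke(a: complex, z: complex) -> complex:
    """Blaschke factor B_a(z) = (z - a) / (1 - conj(a) * z)."""
    return (z - a) / (1 - a.conjugate() * z)

a = 0.3 + 0.4j  # sample point of the open unit disk, |a| = 0.5

# Property ii): |B_a(z)| = 1 on the unit circle (sampled every degree).
on_circle = [abs(blaschke(a, cmath.exp(2j * math.pi * k / 360))) for k in range(360)]

# Property iii): |B_a(z)| <= (|z| + |a|) / (1 - |a||z|) on a polar grid inside the disk.
grid = [0.1 * r * cmath.exp(2j * math.pi * k / 24) for r in range(10) for k in range(24)]
bound_holds = all(
    abs(blaschke(a, z)) <= (abs(z) + abs(a)) / (1 - abs(a) * abs(z)) + 1e-12
    for z in grid
)
print(max(on_circle), min(on_circle), bound_holds)
```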


A classical lemma of Schwarz (see for example [74]) says that a holomorphic
function $g(z)$ on $D(0,1)$ such that $g(0) = 0$ and $|g(z)| \leq 1$ satisfies $|g(z)| \leq |z|$
and $\|g\|_r \leq r$. The lemma stated below is a refinement of this.


4.10. Lemma. (Schwarz lemma) Let $f(z)$ be a function which is holomorphic
on the closed disk $\bar{D}(0,R)$. Suppose that $f$ vanishes with order $T$
at $z = 0$, and we denote by $\|f\|_r = \sup_{|z| = r} |f(z)| = \sup_{|z| \leq r} |f(z)|$. Then

$$\|f\|_r \leq \left(\frac{r}{R}\right)^{T} \|f\|_R.$$

Slightly more generally, if $f$ has zeros of order $T_i$ at $z_i \in D(0, r_0)$, with
$\sum_i T_i = T$, then we have

$$\|f\|_r \leq \prod_i \left( \frac{R(r + |z_i|)}{R^2 - r|z_i|} \right)^{T_i} \|f\|_R \leq \left( \frac{R(r + r_0)}{R^2 - r r_0} \right)^{T} \|f\|_R. \qquad (6.19)$$
Proof. We will prove the second inequality, which is more general. We
introduce the function $g(z) := f(Rz)$, which is holomorphic on the closed
disk $\bar{D}(0,1)$ and has zeros of order $T_i$ at $a_i := z_i/R$. Next, we set

$$g^*(z) := \frac{g(z)}{\prod_i B_{a_i}(z)^{T_i}}.$$

The function $g^*(z)$ is holomorphic on the closed disk, and on the circle
$|z| = 1$, we have $|g^*(z)| = |g(z)|$ (by property ii) of Lemma 6-4.9). In
particular, $\|g^*\|_1 = \|g\|_1 = \|f\|_R$. This implies the inequalities:

$$|g(z)| = |g^*(z)| \prod_i |B_{a_i}(z)|^{T_i} \leq \prod_i \left( \frac{|z| + |a_i|}{1 - |z||a_i|} \right)^{T_i} \|f\|_R,$$

from which we can deduce that

$$\|f\|_r = \|g\|_{r/R} \leq \prod_i \left( \frac{R(r + |z_i|)}{R^2 - r|z_i|} \right)^{T_i} \|f\|_R \leq \left( \frac{R(r + r_0)}{R^2 - r r_0} \right)^{T} \|f\|_R.$$


4.11. Lemma. (Liouville's Inequality) Let $K$ be a number field of degree
$d$ and $v$ a normalized absolute value. If $\alpha \in K^*$, then

$$|\alpha|_v \geq H(\alpha)^{-d} = H_K(\alpha)^{-1}.$$

Proof. This follows easily from the construction of the Weil height and
from the observation that $H(\alpha) = H(\alpha^{-1})$. In fact,

$$H_K(\alpha) = H_K(\alpha^{-1}) = \prod_{w \in M_K} \max\{1, |\alpha^{-1}|_w\} \geq |\alpha|_v^{-1},$$

and the desired inequality follows from this.


This lemma is most often used to prove that an algebraic number with a 
controlled height is zero whenever its absolute value is sufficiently small, or 
that “an arithmetic quantity cannot be too small without being zero”. We 
will state an explicit corollary of this type. 


4.12. Corollary. Let $\alpha$ be an algebraic number in a number field $K$ and
let $v$ be a place of $K$. If $|\alpha|_v < H_K(\alpha)^{-1}$, then $\alpha = 0$.

As a zeros estimate, we will start with an (easy) example in many variables
and prove an elementary lemma which will help us in the proof of Thue's
theorem.


4.13. Lemma. Let $P \in \mathbb{C}[X_1, \dots, X_n]$ be a non-zero polynomial and $S$ a
finite set of complex numbers. If $d := \deg P < |S|$, then there exists $x \in S^n$
such that $P(x) \neq 0$.


Proof. If $n = 1$, then the number of roots is $\leq d$. We will reason
by induction on $n$ and write $P = \sum_{j=0}^{d} P_j(X_1, \dots, X_{n-1}) X_n^j$. Let $T$ be
the set of elements $(x_1, \dots, x_{n-1}) \in S^{n-1}$ such that there exists $j$ where
$P_j(x_1, \dots, x_{n-1}) \neq 0$. By induction, we know that $|T| \geq 1$. The number of
zeros in $S^n$ is therefore $\leq (|S^{n-1}| - |T|)|S| + d|T| = |S|^n - |T|(|S| - d) < |S|^n$.
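The lemma is easy to test by brute force on small examples. The following sketch (the polynomials and the helper `nonvanishing_points` are choices of this illustration) lists the points of $S^n$ where a polynomial does not vanish, and also shows that the hypothesis $\deg P < |S|$ cannot be dropped.

```python
from itertools import product

def nonvanishing_points(poly, points, n):
    """All x in points^n with poly(*x) != 0, found by exhaustive search."""
    return [x for x in product(points, repeat=n) if poly(*x) != 0]

# P(X, Y) = X * Y * (X - Y) has degree d = 3; take |S| = 4 > d.
P = lambda x, y: x * y * (x - y)
S = [0, 1, 2, 3]
good = nonvanishing_points(P, S, 2)

# With |S| = d the conclusion can fail: Q vanishes on all of {0, 1, 2}.
Q = lambda x: x * (x - 1) * (x - 2)
bad = nonvanishing_points(Q, [0, 1, 2], 1)

print(len(good), bad)
```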


4.14. Lemma. Let $P \in \mathbb{Z}[X]$ be a non-zero polynomial and $\beta := p/q \in \mathbb{Q}$.
Then the order of vanishing, $\operatorname{ord}_\beta(P)$, satisfies

$$\operatorname{ord}_\beta P \leq \frac{m(P)}{\log\max(|p|, |q|)} \leq \frac{h(P) + \log\deg P}{h(\beta)},$$

where $m(P) := \int_0^1 \log|P(\exp(2\pi i t))|\,dt \leq h(P) + \log\deg P$.


Proof. We denote by $r := \operatorname{ord}_\beta(P)$. By the hypotheses, we know that
there exists $Q \in \mathbb{Z}[X]$ such that $P = (qX - p)^r Q$. Since $m(P_1 P_2) =
m(P_1) + m(P_2)$, we can deduce that

$$r \log\max(|p|, |q|) \leq r\, m(qX - p) + m(Q) = m(P).$$

The elementary inequality linking $m(P)$ and $h(P)$ is proven in Appendix A
in the form $m(P) \leq \log|P|_2 \leq h(P) + \frac{1}{2}\log(\deg P + 1)$.


Proof. (of Thue's theorem (6-4.1)) We are considering an algebraic number
$\alpha$, which we can assume is an algebraic integer, and we let $d := [\mathbb{Q}(\alpha) : \mathbb{Q}]$.
We want to know whether there exist rational approximations which satisfy

$$\left|\alpha - \frac{p}{q}\right| \leq q^{-\delta}. \qquad (6.20)$$

If the set of solutions of this inequality is infinite, we can assume that
there exists a first solution, $\beta_1 := p_1/q_1$, where $q_1$ is very large, and a
second solution $\beta_2 := p_2/q_2$, where $\log q_2/\log q_1$ is very large. We also fix
an $\varepsilon$ such that $0 < \varepsilon < 1/4$. The first step is to construct an auxiliary
polynomial by choosing the following parameters (the proof will justify
this choice):

$$T := \left\lfloor \frac{\log q_2}{\log q_1} \right\rfloor, \qquad D := \left\lfloor \frac{dT(1+\varepsilon)}{2} \right\rfloor.$$

Throughout the different steps in the proof, we will denote by $C$, $C_1$, $C_2$,
etc., constants which only depend on $d$ and $\alpha$.


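1st step. (Construction of an auxiliary polynomial) There exists a polynomial
$F(X,Y) = P(X) - YQ(X) \in \mathbb{Z}[X,Y]$ such that $\deg P, \deg Q \leq D$
and

$$\frac{\partial^h F}{\partial X^h}(\alpha, \alpha) = 0 \quad \text{for } 0 \leq h \leq T - 1,$$

and whose coefficients have absolute value $\leq C_1^{T/\varepsilon}$.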


The proof of this is a direct application of Siegel’s lemma (Lemmas 6-4.5 
and 6-4.6), where the number of free coefficients is 2(D +1), the number of 
equations (over Z) is dT and the height of the equations involves binomial 
coefficients. 


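2nd step. (Application of Taylor's formula) Let $j < T/2$. Then we have
the inequality:

$$\left| \frac{1}{j!} \frac{\partial^j F}{\partial X^j}(\beta_1, \beta_2) \right| \leq \max\left\{ q_1^{-\delta(T-j)},\, q_2^{-\delta} \right\} C_3^{D + \frac{T}{\varepsilon}}.$$

Using Taylor's formula in one variable (cf. Lemma 6-4.7) and the hypotheses,
we have

$$\begin{aligned}
\frac{\partial^j F}{\partial X^j}(\beta_1, \beta_2) &= P^{(j)}(\beta_1) - \beta_2 Q^{(j)}(\beta_1) \\
&= \sum_{h \geq 0} \frac{P^{(j+h)}(\alpha)}{h!} (\beta_1 - \alpha)^h - \beta_2 \sum_{h \geq 0} \frac{Q^{(j+h)}(\alpha)}{h!} (\beta_1 - \alpha)^h \\
&= \sum_{h \geq 0} \frac{1}{h!} \frac{\partial^{j+h} F}{\partial X^{j+h}}(\alpha, \alpha) (\beta_1 - \alpha)^h - (\beta_2 - \alpha) \sum_{h \geq 0} \frac{Q^{(j+h)}(\alpha)}{h!} (\beta_1 - \alpha)^h.
\end{aligned}$$

The first sum, divided by $j!$, is bounded above by $C_2^{D} q_1^{-\delta(T-j)} \|F\|$, while
the second is bounded above by $C_2^{D} q_2^{-\delta} \|Q\|$. By using the estimate
obtained in the 1st step: $\|F\| = \max(\|P\|, \|Q\|) \leq C_1^{T/\varepsilon}$, we obtain the
desired upper bound.

3rd step. (Extrapolation by Liouville's inequality) We have $\frac{\partial^j F}{\partial X^j}(\beta_1, \beta_2) = 0$ for

$$0 \leq j < T \times \frac{\delta - \frac{d}{2} - 1 - \frac{d\varepsilon}{2} - \frac{C_3}{\varepsilon \log q_1}}{\delta + 1}.$$

Let $j$ be such that $\frac{1}{j!} \frac{\partial^j F}{\partial X^j}(\beta_1, \beta_2) \neq 0$. Then since it is a rational number
with denominator $q_1^{D-j} q_2$, its absolute value is larger than $q_1^{-D+j} q_2^{-1}$ (cf.
Lemma 6-4.11). By combining this with the previous step, we obtain

$$q_1^{-D+j} q_2^{-1} \leq \max\left\{ q_1^{-\delta(T-j)},\, q_2^{-\delta} \right\} C_3^{D + \frac{T}{\varepsilon}}.$$

By our choice of $T$, we have $q_1^{T+1} > q_2 \geq q_1^{T}$. We can then deduce from
this that $-D + j - T - 1 \leq -\delta(T - j) + (D + T/\varepsilon) \frac{\log C_3}{\log q_1}$. By our choice
of $D$, we have $D \leq \frac{dT(1+\varepsilon)}{2} < D + 1$, hence the inequality

$$T \left( \delta - \frac{d}{2} - 1 - \frac{d\varepsilon}{2} - \frac{C_3}{\varepsilon \log q_1} \right) \leq j(\delta + 1).$$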


4th step. (Zeros estimate) There exists $j \leq C_4 T/(\varepsilon \log q_1)$ such that

$$\frac{1}{j!} \frac{\partial^j F}{\partial X^j}(\beta_1, \beta_2) \neq 0.$$

We introduce the Wronskian $W(X) := P'(X)Q(X) - P(X)Q'(X)$. Observe
that $W$ is not identically zero, for if so, $P$ and $Q$ would be proportional
and hence divisible by $(X - \alpha)^T$ and likewise by $P_\alpha(X)^T$ (where $P_\alpha$ is
the minimal polynomial of $\alpha$). We know $dT > D$, and therefore, $P = 0$.
Lemma 6-4.14 then allows us to prove that

$$\operatorname{ord}_{\beta_1} W \leq \frac{C_4 T}{\varepsilon \log q_1}.$$

The conclusion is now clear: we obtain a contradiction if $q_1$ is too large
and $\delta > \frac{d}{2} + 1 + 2\varepsilon$, which finishes the proof of Thue's theorem.


While technically more elaborate, the proof of Roth's theorem relies
essentially on the same ingredients: we construct a polynomial $P \in \mathbb{Z}[X_1, \dots, X_m]$
which vanishes with large order at $(\alpha, \dots, \alpha)$, and the zeros estimate (the
most difficult part) also relies on the use of Wronskians. The result obtained
is by its nature not computationally effective in the two cases since
the bounds obtained depend on the size of solutions of inequalities whose
existence is not known and whose nonexistence is practically denied by the
conclusion of the theorem.


The following statement is the promised computationally effective version 
of Baker’s theorem on linear forms of logarithms (Theorem 6-4.3). 


4.15. Theorem. (Baker [1]) Let $\alpha_1, \dots, \alpha_n$ be non-zero algebraic numbers.
There exists $C > 0$, which can be computed and only depends on $d$, $n$
and the $\alpha_i$, such that for every $\beta_0, \dots, \beta_n$ which are algebraic and of degree
at most $d$ and height at most $B$, we have

$$|\beta_0 + \beta_1 \log\alpha_1 + \cdots + \beta_n \log\alpha_n| \geq B^{-C}, \qquad (6.21)$$

whenever the quantity is non-zero.


The most widely used version of this theorem is the following (see also [1] 
for the proof). 


4.16. Corollary. Let $\alpha_1, \dots, \alpha_n$ be non-zero algebraic numbers. Then
there exists $C > 0$, which can be computed and only depends on $n$ and the
$\alpha_i$, such that for all $b_1, \dots, b_n$, integers with absolute value at most $B$, we
have

$$\left| \alpha_1^{b_1} \cdots \alpha_n^{b_n} - 1 \right| \geq B^{-C}, \qquad (6.22)$$

whenever the quantity is non-zero.


The corollary can be deduced, of course, from an inequality of the type
$|\exp(z) - 1| \geq C|z|$. The assertion is given with the Archimedean absolute
value, but it remains valid for a $p$-adic absolute value and is moreover
proved in a similar manner (by conveniently defining $p$-adic logarithms).
By assuming that the constant $C = C(n, d, \alpha_1, \dots, \alpha_n)$ exists and can be
computed, we can see how this theorem allows us to explicitly solve
the $S$-unit equation.


Proof. (that Corollary 6-4.16 implies Theorem 5-4.4) For convenience's sake,
we let $S$ be the set of Archimedean places of the field $K$, plus a finite set of
finite places and $r := |S| - 1$. Consider the embedding $L : \mathcal{O}_{K,S}^*/\mu_K \to \mathbb{R}^{|S|}$
given by $L(\alpha) := (\log|\alpha|_v)_{v \in S}$. Let $\epsilon_1, \dots, \epsilon_r$ be a basis for the $S$-units
modulo roots of unity; in other words, every element $u \in \mathcal{O}_{K,S}^*$ can be
written uniquely as

$$u = \zeta\, \epsilon_1^{m_1} \cdots \epsilon_r^{m_r} \quad \text{where } \zeta \in \mu_K \text{ and } m_i \in \mathbb{Z}.$$

We can define two norms on the lattice $L(\mathcal{O}_{K,S}^*)$: the norm induced by the
sup-norm on $\mathbb{R}^{|S|}$ and the norm $M(u) = \max_i |m_i|$. Since these two norms
are comparable, we obtain a constant $c_1$ such that for every $u \in \mathcal{O}_{K,S}^*$ we
have

$$c_1^{-1} M(u) \leq \max_{v \in S} \bigl|\log|u|_v\bigr| \leq c_1 M(u).$$

Since we know that $\sum_{v \in S} \log|u|_v = 0$ (for $u \in \mathcal{O}_{K,S}^*$), we can see that
$\max_{v \in S} \log|u|_v \leq \max_{v \in S} \bigl|\log|u|_v\bigr| \leq (|S| - 1) \max_{v \in S} \log|u|_v$, and we can
therefore conclude that there exists $c_2$ such that

$$c_2^{-1} M(u) \leq \max_{v \in S} \log|u|_v \leq c_2 M(u). \qquad (6.23)$$

We now consider a solution $u, v \in \mathcal{O}_{K,S}^*$ to the equation $u + v = 1$,
where $v = \zeta' \epsilon_1^{n_1} \cdots \epsilon_r^{n_r}$. On the one hand, by the previous considerations,
we obtain (writing $w$ for the places of $S$):

$$\min_{w \in S} \left| -\frac{v}{u} - 1 \right|_w = \min_{w \in S} \left| \frac{1}{u} \right|_w = \exp\left( -\log\max_{w \in S} |u|_w \right) \leq \exp\left( -\frac{M(u)}{c_2} \right). \qquad (6.24)$$

On the other hand, a direct application of the corollary to Baker's theorem
(6-4.16) provides the existence of a constant $c$ dependent on the field $K$,
on the set $S$ and on the fundamental units $\epsilon_1, \dots, \epsilon_r$ such that we have the
inequality

$$\max\{M(u), M(v)\}^{-c} \leq \min_{w \in S} \left| -\zeta' \zeta^{-1} \epsilon_1^{n_1 - m_1} \cdots \epsilon_r^{n_r - m_r} - 1 \right|_w = \min_{w \in S} \left| -\frac{v}{u} - 1 \right|_w. \qquad (6.25)$$

Up to reversing the roles of $u$ and $v$, we can suppose that $M(v) \leq M(u)$
and deduce from (6.24) and (6.25) that $M(u)^{-c} \leq \exp\left( -\frac{M(u)}{c_2} \right)$, which
clearly bounds $M(u)$ and therefore leaves only a finite number of possibilities
for $u$ and hence also for $v$.
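The final inequality $M^{-c} \leq \exp(-M/c_2)$ is equivalent to $M \leq c\,c_2 \log M$, so the bound on $M(u)$ is explicit as soon as the constants are. The sketch below computes the largest admissible integer $M$ for given values of the constants; the function name `max_exponent` and the sample values $c = 10$, $c_2 = 2$ are purely hypothetical, since the constants coming from Baker's theorem are in practice far larger.

```python
from math import log

def max_exponent(c: float, c2: float) -> int:
    """Largest integer M >= 1 with M**(-c) <= exp(-M / c2), i.e. M <= c*c2*log(M).

    Returns 0 when no such M exists.
    """
    k = c * c2
    best = 0
    m = 1
    # For m > k the function m - k*log(m) is increasing, so once m > k and
    # m > k*log(m) there are no further solutions and the search can stop.
    while m <= k or m <= k * log(m):
        if m <= k * log(m):
            best = m
        m += 1
    return best

print(max_exponent(10, 2))
```

Every exponent vector of a solution would then lie in the finite box $|m_i| \leq M$, which can in principle be enumerated.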


4.17. Remark. We point out that to prove the finiteness of the set of
solutions of the $S$-unit equation, it suffices to have a weaker version of
Baker's theorem of the form:

$$|m_1 \log\alpha_1 + \cdots + m_n \log\alpha_n| \geq \exp(-\psi(M)) \quad \text{with } \psi(M) = o(M),$$

where $M := \max|m_i|$. In particular, Liouville's inequality would only give
$|m_1 \log\alpha_1 + \cdots + m_n \log\alpha_n| \geq \exp(-cM)$ and is therefore simply
insufficient! Finally, it is clear that Baker's argument is computationally effective,
under the condition that the function $\psi(M)$ is given explicitly, even if it
often leads to very large bounds.


We will finish this chapter, a good part of it dedicated to problems of
effectiveness (in a computational sense), by pointing out that it is unrealistic to
expect to find computationally effective solutions to all of the Diophantine
problems. The famous proof concerning Hilbert's 10th problem by
Matijasevič (see [24]) proves that we will never find a universal algorithm which
decides whether a Diophantine equation has an integer solution. Nevertheless,
we can hope that the problem of determining the finite set of integer
points on a curve (Siegel's theorem) or even the finite set of rational points
on a curve (Faltings's theorem) has a computationally effective solution.
For integral (or $S$-integral) points, the problem is solved for curves of genus
0 and 1 by Baker. Baker's method can also be applied to curves of genus 2
which can be written as $y^2 = P(x)$ (where $\deg P = 5$ or $6$), but not in general
to curves of genus 2 of the following type:


$$y^2 + f_3(x,y) + f_4(x,y) = 0 \qquad (6.26)$$

where the $f_i$ are homogeneous of degree $i$.


5. The a, b, c Conjecture 


This section is a little peculiar, since it discusses the consequences of a con- 
jecture, which has not yet been proven and whose formulation was presented 
in the 1980’s by Masser and Oesterlé. Moreover, all of the assertions in this 
section are conditional upon it. Nevertheless, the elementary character of 
the a,b,c conjecture and the depth of its implications make it a very active 
subject of investigation and experimentation. A surprise is provided by the 
“dictionary” between such elementary statements and the theory of elliptic 
curves. To deepen your understanding of the subject and its connections, 
we recommend the presentation of Oesterlé in séminaire Bourbaki [56]. 


We will begin by proving the following easy theorem. 


5.1. Theorem. Let $A, B, C$ be non-constant polynomials which are
relatively prime to each other and such that $A + B + C = 0$. Then

$$\max\{\deg(A), \deg(B), \deg(C)\} \leq r_0(ABC) - 1, \qquad (6.27)$$

where $r_0(P)$ denotes the number of distinct zeros of $P$.


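Proof. We first write the factorizations of $A, B, C$:

$$A = a \prod_{i=1}^{r} (T - \alpha_i)^{\ell_i}, \quad B = b \prod_{i=1}^{s} (T - \beta_i)^{m_i}, \quad C = c \prod_{i=1}^{t} (T - \gamma_i)^{n_i}.$$

We then introduce the determinant given by $\Delta = \det\begin{pmatrix} A & B \\ A' & B' \end{pmatrix}$. We
can easily see that $\Delta = -\det\begin{pmatrix} A & C \\ A' & C' \end{pmatrix} = \det\begin{pmatrix} B & C \\ B' & C' \end{pmatrix}$ and thus that
$\prod_{i=1}^{r} (T - \alpha_i)^{\ell_i - 1}$ divides $\Delta$ and likewise $\prod_{i=1}^{s} (T - \beta_i)^{m_i - 1}$ and
$\prod_{i=1}^{t} (T - \gamma_i)^{n_i - 1}$. Suppose, for example, that $\deg(C)$ is the largest of the degrees.
Then we have:

$$(\deg(A) - r) + (\deg(B) - s) + (\deg(C) - t) \leq \deg(\Delta) \leq \deg(A) + \deg(B) - 1,$$

hence $\deg(C) \leq r + s + t - 1$, which is what we wanted to prove.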
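A worked example, chosen for this illustration and not taken from the text, shows that the inequality of the theorem can be an equality. Over $\mathbb{C}$, take the polynomial analogue of the Pythagorean identity:

```latex
\begin{aligned}
A &= (1 - T^2)^2, \qquad B = (2T)^2, \qquad C = -(1 + T^2)^2, \\
A + B + C &= (1 - 2T^2 + T^4) + 4T^2 - (1 + 2T^2 + T^4) = 0, \\
\max\{\deg A, \deg B, \deg C\} &= 4, \\
r_0(ABC) &= \#\{1, -1, 0, i, -i\} = 5, \\
4 &\leq r_0(ABC) - 1 = 4 .
\end{aligned}
```

The three polynomials are pairwise relatively prime, since their zero sets $\{\pm 1\}$, $\{0\}$ and $\{\pm i\}$ are disjoint, so the hypotheses of the theorem are satisfied.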


The following conjecture is suggested by analogy. 


5.2. Conjecture. (Masser-Oesterlé) Let $\varepsilon > 0$. There exists a constant
$C_\varepsilon$ such that if $a, b, c$ are relatively prime integers which satisfy the equation
$a + b + c = 0$, then

$$\max\{|a|, |b|, |c|\} \leq C_\varepsilon \left( \prod_{p \mid abc} p \right)^{1+\varepsilon}. \qquad (6.28)$$

If we introduce the notation $\operatorname{Rad}(n) := \prod_{p \mid n} p$ (resp. $\operatorname{rad}(n) = \log\operatorname{Rad}(n)$),
we can rewrite the previous inequality in the form

$$h(a, b, c) \leq (1 + \varepsilon) \operatorname{rad}(abc) + C'_\varepsilon.$$

We are going to see that this apparently innocent assertion, christened “the
a, b, c conjecture”, has surprisingly deep consequences. A stronger form,
christened “the effective a, b, c conjecture”, requires that the constant $C_\varepsilon$
be computable (in terms of $\varepsilon$).
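To experiment with the conjecture, one can compute the radical and the ratio $\log c / \log\operatorname{Rad}(abc)$ for a triple written in the form $a + b = c$ with positive coprime $a$, $b$ (this differs from the normalization $a + b + c = 0$ only by a sign). The function names `rad` and `quality` below are conventions of this sketch; the conjecture asserts that for every $\varepsilon > 0$ only finitely many triples have ratio larger than $1 + \varepsilon$.

```python
from math import gcd, log

def rad(n: int) -> int:
    """Product of the distinct primes dividing n (n != 0), by trial division."""
    n = abs(n)
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * n if n > 1 else r

def quality(a: int, b: int) -> float:
    """log(c) / log(Rad(abc)) for the triple a + b = c with gcd(a, b) = 1."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime")
    c = a + b
    return log(c) / log(rad(a * b * c))

print(quality(1, 8))               # 1 + 2^3 = 3^2
print(quality(2, 3 ** 10 * 109))   # 2 + 3^10 * 109 = 23^5
```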


5.3. Remark. Let $S$ be a finite set of prime numbers, and let $u, v \in \mathbb{Z}_S^*$ be
$S$-unit solutions to the equation $u + v = 1$. If we reduce the expressions to
relatively prime integers, $u = a/c$, $v = b/c$, and apply the a, b, c conjecture
to the equation $a + b = c$, we obtain

$$\max\{h(u), h(v)\} \leq (1 + \varepsilon) \sum_{p \in S} \log p + C_\varepsilon.$$

By reversing the argument, we see that we can reformulate the a, b, c
conjecture as a uniform version of the bound on the heights of solutions of the
$S$-unit equation.


5.4. Proposition. Assume that Conjecture 6-5.2 is true. Let $\ell, m, n \geq 2$
be integers such that $\ell^{-1} + m^{-1} + n^{-1} < 1$. Furthermore, let $S$ be a finite set
of prime numbers, and let $u$, $v$ and $w$ be $S$-unit integers. Then the number
of solutions to

$$uX^{\ell} + vY^{m} + wZ^{n} = 0, \qquad (6.29)$$

where $X, Y, Z$ are relatively prime, is finite and bounded solely in terms
of $S$.

Proof. By applying the a, b, c conjecture to (6.29), we obtain

$$\max\left( |X|^{\ell}, |Y|^{m}, |Z|^{n} \right) \leq C_\varepsilon \operatorname{Rad}(uvw X^{\ell} Y^{m} Z^{n})^{1+\varepsilon} \leq C_S |XYZ|^{1+\varepsilon},$$

from which we can easily deduce

$$|XYZ| \leq C'_S\, |XYZ|^{(1+\varepsilon)\left( \frac{1}{\ell} + \frac{1}{m} + \frac{1}{n} \right)}.$$

The last inequality is clearly a bound on the integer $|XYZ|$ whenever
$\ell^{-1} + m^{-1} + n^{-1} < 1$ (and $\varepsilon$ is chosen small enough).


5.5. Remarks. In the remaining cases, i.e., up to permutation $(\ell, m, n) =
(2,2,m)$, $(2,3,3)$, $(2,3,4)$, $(2,3,5)$, $(2,3,6)$, $(2,4,4)$ or $(3,3,3)$, it can be
shown that there are, at least for certain $u$, $v$ and $w$, infinitely many integer
solutions. This statement gives a proof, modulo the a, b, c conjecture, of
Faltings's theorem (Theorem 6-2.6) for Fermat curves given by the
homogeneous equation (for $m \geq 4$):

$$uX^{m} + vY^{m} + wZ^{m} = 0.$$

We can considerably strengthen these statements, still modulo the a, b, c
conjecture. First observe that we can reformulate the a, b, c conjecture by
expressing it in terms of relatively prime $a$ and $b$, thus forgetting $c$, in the
form

$$\max\{|a|, |b|\}^{1-\varepsilon} \leq C_\varepsilon \operatorname{Rad}(ab(a+b)).$$


The generalization that we have in mind is the following. 


5.6. Proposition. Suppose that the a, b, c conjecture is true.

i) Let $F \in \mathbb{Z}[X,Y]$ be homogeneous of degree $d$ with no multiple factors
and $\varepsilon > 0$. Then there exists a constant $C_{F,\varepsilon} > 0$ such that for all
relatively prime integers $a$ and $b$, we have:

$$\max\{|a|, |b|\}^{d-2-\varepsilon} \leq C_{F,\varepsilon} \operatorname{Rad}(F(a,b)).$$

ii) Let $f \in \mathbb{Z}[X]$ be of degree $d$ with no multiple factors and $\varepsilon > 0$. Then
there exists a constant $C_{f,\varepsilon} > 0$ such that for every integer $a$, we have:

$$|a|^{d-1-\varepsilon} \leq C_{f,\varepsilon} \operatorname{Rad}(f(a)).$$

To prove the proposition, we can rely on (a particular case of) a theorem
of Belyi and the Riemann-Hurwitz formula, which are stated below.
A morphism $\phi$ of degree $d$ from $\mathbb{P}^1$ to $\mathbb{P}^1$ is given by two homogeneous
polynomials, $A$ and $B$, of degree $d$ and which are relatively prime:
$(x_0, x_1) \mapsto (A(x_0, x_1), B(x_0, x_1))$. For almost all points $x = (x_0, x_1) \in \mathbb{P}^1$,
the cardinality of $\phi^{-1}\{x\}$ is constant, equal to $d$. The morphism $\phi$ is
ramified above $x$ precisely when $|\phi^{-1}\{x\}| < d$. For the proof of Formula 6-5.8,
we refer you to [35] or [88].


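5.7. Proposition. (Belyi) Let $S$ be a finite subset of $\mathbb{P}^1(\overline{\mathbb{Q}})$. There exists
a finite morphism $\phi : \mathbb{P}^1 \to \mathbb{P}^1$ such that

i) $\phi$ is unramified over $\mathbb{P}^1 \setminus \{0, 1, \infty\}$,

ii) $\phi(S) \subset \{0, 1, \infty\}$.


5.8. Proposition. (Riemann-Hurwitz formula for $\mathbb{P}^1$) Let $\phi : \mathbb{P}^1 \to \mathbb{P}^1$
be of degree $d$. Then $|\phi^{-1}\{x\}| = d$ for almost all points $x$ of $\mathbb{P}^1$, and

$$2d - 2 = \sum_{x \in \mathbb{P}^1} \left( d - |\phi^{-1}\{x\}| \right).$$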

Proof. (of Proposition 6-5.7) Let $d$ be the degree of the field generated by an
irrational point of $S$. By applying the minimal polynomial which vanishes
at this point, we send it to $0 \in \mathbb{P}^1$, and the new ramification points are
now defined over a field of degree $< d$. By iterating this procedure, we can
reduce to assuming that $S \subset \mathbb{P}^1(\mathbb{Q})$. We finish the proof by repeatedly
using morphisms of the type:

$$\phi(z) = \frac{(a+b)^{a+b}}{a^a b^b}\, z^a (1-z)^b.$$

Any such morphism is unramified over $\mathbb{P}^1 \setminus \{0, 1, \infty\}$ and sends $\{0, 1, \infty, \frac{a}{a+b}\}$
to $\{0, 1, \infty\}$.
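The behaviour of these auxiliary morphisms can be verified with exact rational arithmetic. The sketch below assumes the standard form $\phi(z) = \frac{(a+b)^{a+b}}{a^a b^b} z^a (1-z)^b$ and, as a simplification of this illustration, positive integers $a$ and $b$; the function name `belyi_step` is hypothetical.

```python
from fractions import Fraction

def belyi_step(a: int, b: int, z) -> Fraction:
    """Evaluate phi(z) = (a+b)^(a+b) / (a^a * b^b) * z^a * (1-z)^b exactly.

    Assumes a, b are positive integers and z is an int or a Fraction.
    """
    scale = Fraction((a + b) ** (a + b), a ** a * b ** b)
    z = Fraction(z)
    return scale * z ** a * (1 - z) ** b

# The rational point a/(a+b) is sent to 1, while 0 and 1 are sent to 0.
print(belyi_step(2, 3, Fraction(2, 5)), belyi_step(2, 3, 0), belyi_step(2, 3, 1))
```

Applying such a map repeatedly reduces the number of rational points lying outside $\{0, 1, \infty\}$ by one at each stage.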


Proof. (of Proposition 6-5.6) First observe that statement ii) can be
deduced from statement i) applied to the polynomial $F(X,Y) = Y^{d+1} f(X/Y)$.
To prove statement i), we take $S \subset \mathbb{P}^1(\overline{\mathbb{Q}})$, the set of zeros of $F(X,Y)$,
and we find a Belyi map (by Proposition 6-5.7), $\phi : \mathbb{P}^1 \to \mathbb{P}^1$, given by
two polynomials, $A(X,Y)$ and $B(X,Y)$, of degree $\delta$. We set $C(X,Y) =
A(X,Y) - B(X,Y)$. The Riemann-Hurwitz formula can therefore be written

$$\delta + 2 = |\phi^{-1}\{0, 1, \infty\}|.$$

We point out that, for $x \in \mathbb{P}^1$, we have
1) $A(x) = 0$ if and only if $\phi(x) = \infty$,
2) $B(x) = 0$ if and only if $\phi(x) = 0$,
3) $C(x) = (A - B)(x) = 0$ if and only if $\phi(x) = 1$.

Thus, if we set $D(X,Y) = ABC(X,Y)$ and $D_0(X,Y) = \operatorname{Rad} D(X,Y)$, we
have the inclusion $S \subset \{\text{zeros of } D\} = \{\text{zeros of } D_0\}$, and thus, since $F$ is
square-free, $D_0 = FG$. Likewise, we have $\deg(D_0) = |\phi^{-1}\{0, 1, \infty\}| = \delta + 2$,
hence $\deg(G) = \delta + 2 - d$. Moreover, $A(X,Y)$ and $B(X,Y)$ are
relatively prime. Then there exists an integer $R$ (essentially a resultant)
and polynomials with integer coefficients such that $A(X,Y)U(X,Y) +
B(X,Y)V(X,Y) = RX^{m}$ and $A(X,Y)U'(X,Y) + B(X,Y)V'(X,Y) =
RY^{m}$. If now $a$ and $b$ are relatively prime integers, we can deduce that
$e := \gcd(A(a,b), B(a,b))$ divides $R$ and hence is bounded independently of
$a$ and $b$. We therefore apply the statement of the a, b, c conjecture to the
triple $\frac{A(a,b)}{e}, \frac{B(a,b)}{e}, \frac{C(a,b)}{e}$. We then obtain

$$\max\left\{ \left|\frac{A(a,b)}{e}\right|, \left|\frac{B(a,b)}{e}\right|, \left|\frac{C(a,b)}{e}\right| \right\} \leq C_\varepsilon \operatorname{Rad}(D(a,b))^{1+\varepsilon}.$$

We can easily see that $\max(|A(a,b)|, |B(a,b)|) \geq C \max(|a|, |b|)^{\delta}$ and, on
the other hand, that $\operatorname{Rad}(D(a,b)) = \operatorname{Rad}(D_0(a,b))$ and

$$\operatorname{Rad}(D_0(a,b)) \leq \operatorname{Rad}(F(a,b))\,|G(a,b)| \leq C \operatorname{Rad}(F(a,b)) \max(|a|, |b|)^{\delta + 2 - d}.$$

By combining the obtained inequalities and by simplifying by $\max(|a|, |b|)^{\delta}$,
we get exactly the desired assertion.


5.9. Remark. Proposition 6-5.6 allows us to show that if we assume the
a, b, c conjecture, then the projective curve
given by the homogeneous equation

$$F(X,Y) = mZ^{d}$$

has a finite number of rational points whenever $d := \deg(F) \geq 4$, which
corresponds to $g = \frac{(d-1)(d-2)}{2} \geq 2$. Elkies [29] extended this argument
by using Belyi's results and proved that the a, b, c conjecture allows
us to recover Faltings's theorem for every curve. Thus, in particular, a
computationally effective version of the a, b, c conjecture would allow us
to effectively compute the rational points on curves of genus $\geq 2$.


We will now prove that the a, b,c conjecture can be formulated in terms of 
elliptic curves (see Chap. 5 for notations and notions). 


5.10. Conjecture. (Szpiro) Let $\varepsilon > 0$. There exists a constant $C_\varepsilon$
such that for every elliptic curve $E/\mathbb{Q}$ with minimal discriminant $\Delta_E$ and
conductor $N_E$, we have

$$|\Delta_E| \leq C_\varepsilon N_E^{6+\varepsilon}. \qquad (6.30)$$

We can formulate a slightly stronger variation.


5.11. Conjecture. (Frey-Szpiro) Let $\varepsilon > 0$. There exists a constant $C_\varepsilon$
such that for every elliptic curve $E/\mathbb{Q}$ with minimal discriminant $\Delta_E$ and
conductor $N_E$, we have

$$\max\{H(j_E), |\Delta_E|\} \leq C_\varepsilon N_E^{6+\varepsilon}, \qquad (6.31)$$

where $H(j_E)$ designates the height of the invariant $j_E$.


Since $j_E = c_4^3/\Delta_E$ and $1728\Delta_E = c_4^3 - c_6^2$, we can replace the left-hand
side of the inequality by $\max(|c_4|^3, |\Delta_E|)$ or $\max(|c_4|^3, |c_6|^2, |\Delta_E|)$, up to
modifying the constant $C_\varepsilon$.


5.12. Proposition. The a, b, c conjecture is equivalent to the following
assertion: for every $\varepsilon > 0$, there exists a constant $C_\varepsilon > 0$ such that
for every elliptic curve $E$ defined over $\mathbb{Q}$, we have

$$\max\left( |\Delta_E|, |c_4^3|, |c_6^2| \right) \leq C_\varepsilon (N_E)^{6+\varepsilon}. \qquad (6.32)$$

Thus, the a, b, c conjecture and the Frey-Szpiro conjecture are equivalent.


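Proof. We will first show that the a, b, c conjecture implies the Frey-Szpiro
conjecture. We start from the equality $c_4^3 - c_6^2 = 1728\Delta$, and we
set $d = \gcd(c_4^3, c_6^2)$ and $R = \operatorname{Rad}(c_4^3 c_6^2\, 1728\Delta/d^3)$. We also denote by $\lceil x \rceil$
the smallest integer which is an upper bound for the real number $x$. If we
factor $d = \prod_{i=1}^{t} p_i^{r_i}$, then $p_i^{\lceil r_i/3 \rceil}$ divides $c_4$ (resp. $p_i^{\lceil r_i/2 \rceil}$ divides $c_6$), and
we can write:

$$\begin{aligned}
R &= \operatorname{Rad}\left( \frac{c_4^3}{\prod_i p_i^{3\lceil r_i/3 \rceil}} \cdot \frac{c_6^2}{\prod_i p_i^{2\lceil r_i/2 \rceil}} \cdot \frac{1728\Delta}{d} \cdot \prod_i p_i^{3\lceil r_i/3 \rceil + 2\lceil r_i/2 \rceil - 2r_i} \right) \\
&\leq \operatorname{Rad}\left( \frac{c_4}{\prod_i p_i^{\lceil r_i/3 \rceil}} \cdot \frac{c_6}{\prod_i p_i^{\lceil r_i/2 \rceil}} \cdot \frac{6N}{\prod_i p_i} \right) \\
&\leq \frac{6|c_4 c_6 N|}{\prod_i p_i^{\lceil r_i/3 \rceil + \lceil r_i/2 \rceil + 1}}.
\end{aligned}$$

The second to last inequality is true since, on the one hand, $\operatorname{Rad}(\Delta) =
\operatorname{Rad}(N)$ and, on the other hand, if $\ell \neq 2, 3$ divides $d$, then there is additive
reduction and $\ell^2$ divides $N$, thus $\ell$ indeed appears in $N/\prod_i p_i$. Therefore,
we use the a, b, c conjecture with $a = c_4^3/d$, $b = c_6^2/d$ and $c = 1728\Delta/d$
(up to sign), and we obtain

$$\max\left( |c_4^3|, |c_6^2|, |\Delta| \right) \leq C_\varepsilon \left\{ |c_4 c_6 N| \prod_i p_i^{\alpha(r_i)} \right\}^{1+\varepsilon},$$

where we let $\alpha(r) := r - \lceil r/3 \rceil - \lceil r/2 \rceil - 1$. Observe that $\alpha(r) \leq 0$ for $r \leq 10$
(whereas $\alpha(12) = 1$), and we use the following elementary computation: if
$\ell$ divides $d$, then the order of $d$ at $\ell$ is at most 10. In fact, if $\ell^4$ divided $c_4$
and $\ell^6$ divided $c_6$, the model that we started with would not be minimal,
contrary to the hypotheses. Thus either $\operatorname{ord}_\ell(c_4) \leq 3$ or $\operatorname{ord}_\ell(c_6) \leq 5$. We
can therefore conclude that

$$\max\left( |c_4^3|, |c_6^2|, |\Delta| \right) \leq C_\varepsilon |c_4 c_6 N|^{1+\varepsilon}.$$

We will allow ourselves from now on to denote by $\varepsilon$ and $C_\varepsilon$ the successive
constants (a priori different). We first obtain the inequalities $|c_4^3| \leq
C_\varepsilon |c_6 N|^{3/2+\varepsilon}$ and $|c_6| \leq C_\varepsilon |c_4 N|^{1+\varepsilon}$, which imply the inequalities $|c_4| \leq
C_\varepsilon N^{2+\varepsilon}$ and $|c_6| \leq C_\varepsilon N^{3+\varepsilon}$, which yields the first implication of the
proposition.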
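The elementary claim about the exponent $\alpha(r) = r - \lceil r/3 \rceil - \lceil r/2 \rceil - 1$ can be confirmed by direct tabulation. In the sketch below the function name `alpha` and the helper `ceil_div` are choices of this illustration; ceilings are computed with integer arithmetic to avoid rounding issues.

```python
def ceil_div(r: int, k: int) -> int:
    """Ceiling of r / k for integers r >= 0 and k > 0."""
    return -(-r // k)

def alpha(r: int) -> int:
    """alpha(r) = r - ceil(r/3) - ceil(r/2) - 1."""
    return r - ceil_div(r, 3) - ceil_div(r, 2) - 1

table = {r: alpha(r) for r in range(1, 13)}
print(table)
```

The table shows that $\alpha(r) \leq 0$ throughout the range $1 \leq r \leq 10$ needed in the proof, and that $\alpha(12) = 1$.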

For the converse, let $a, b, c$ satisfy $a + b + c = 0$ and $\gcd(a, b, c) = 1$. We
consider the associated Frey-Hellegouarch curve: $y^2 = x(x - a)(x + b)$, and
we can easily compute that

$$j = 2^8\, \frac{(a^2 + ab + b^2)^3}{(abc)^2} \quad \text{and} \quad \Delta = 2^4 (abc)^2.$$

The Frey-Szpiro conjecture applied to this curve can therefore be written
as

$$\log\max\left( |a^2 + ab + b^2|^3, |abc|^2 \right) \leq (6 + \varepsilon) \log(\operatorname{Rad}(abc)) + C_\varepsilon,$$

which of course implies that $\log\max(|a|, |b|, |c|) \leq (1 + \varepsilon) \log(\operatorname{Rad}(abc)) + C_\varepsilon$.
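The invariants of the Frey-Hellegouarch curve can be recomputed from the standard formulas for a Weierstrass equation $y^2 = x^3 + a_2 x^2 + a_4 x + a_6$. The sketch below does so for the model $y^2 = x(x-a)(x+b)$; the function names are hypothetical, and the discriminant computed is that of this particular model, which need not be minimal at 2.

```python
def invariants(a2: int, a4: int, a6: int):
    """Return (c4, c6, discriminant) of y^2 = x^3 + a2*x^2 + a4*x + a6."""
    b2, b4, b6 = 4 * a2, 2 * a4, 4 * a6
    b8 = 4 * a2 * a6 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, c6, disc

def frey(a: int, b: int):
    """Invariants of y^2 = x(x - a)(x + b) = x^3 + (b - a)x^2 - ab*x."""
    return invariants(b - a, -a * b, 0)

a, b = 3, 5
c = -(a + b)
c4, c6, disc = frey(a, b)
print(c4, c6, disc)
```

For $(a, b) = (3, 5)$ this gives $c_4 = 16(a^2 + ab + b^2)$ and $\Delta = 2^4(abc)^2$, in agreement with the formulas for $j = c_4^3/\Delta$ and $\Delta$.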


5.13. Remark. We can also prove that Szpiro's conjecture implies the
a, b, c conjecture with exponent 6/5. To see this, let $a + b = c$ where $a, b$
and $c > 0$ and $a > b$. Consider the elliptic curve $E : y^2 = x^3 - 2(a - b)x^2 +
(a + b)^2 x$. It is a curve isogenous to the Frey-Hellegouarch curve
$y^2 = x(x + a)(x - b)$ (in other words, there exists an isogeny, which is a
surjective, algebraic homomorphism with a finite kernel between the two
curves). The isogeny with kernel equal to the group of order 2 generated
by $P = (0,0)$ is given by the formulas: $(x, y) \mapsto (y^2/x^2, -y(ab + x^2)/x^2)$.
We can check that the model of $E$ is minimal and even semi-stable, except
perhaps in 2 (the curve is semi-stable in 2 if and only if $2^4$ divides $abc$) and
that the discriminant of the model equals $D = -2^8 abc^4$. Szpiro's conjecture
therefore implies that

$$a^5 \leq abc^4 \leq C_\varepsilon \operatorname{Rad}(abc)^{6+\varepsilon},$$

which gives the desired inequality.


5.14. Remark. Diverse approaches have been proposed to prove the
a, b, c conjecture. Philippon proposed an approach based on very strong
lower bounds of linear forms in logarithms. Today, because of Wiles's
theorem [80] (see Theorem 6-6.14 for the statement and this same section
for the definition of $X_0(N)$), which guarantees the existence of a modular
parametrization

$$\phi_E : X_0(N_E) \to E,$$

the most exciting approach is to try to bound the degree of the parametrization
$\phi_E$. In particular, the following conjecture implies the a, b, c conjecture
(see [38] and [54]).


5.15. Conjecture. (Degree conjecture) For every $\varepsilon > 0$, there exists a
constant $C_\varepsilon$ such that for every elliptic curve $E$ defined over $\mathbb{Q}$, there exists
a modular parametrization $\phi_E : X_0(N_E) \to E$ which satisfies

$$\deg(\phi_E) \leq C_\varepsilon N_E^{2+\varepsilon}.$$


6. Some Remarkable Dirichlet Series 


We have already encountered many Dirichlet series. In this section, we
will introduce various generalizations of them. We will succinctly describe
the connections, proven and conjectural, between some of these “$\zeta$” or “$L$”
functions: the series associated to modular forms and their generalizations,
the series associated to (families of) Galois representations and finally the
Hasse-Weil series associated to algebraic varieties. This leads us to the
border of the “automorphic world”. To go further, we recommend consulting
[27], [32], [68], as well as [20], [26], [66] and [65].


We will start by stating a generalization from the Riemann zeta function to 
the Dedekind zeta function of the analytic continuation with a functional 
equation (Theorem 4-5.6). 


To do this, it will be convenient to introduce some small modifications to 
the Gamma function. 


Notation. We denote the modifications to the Gamma function as follows:

$$\Gamma_{\mathbf{R}}(s) := \pi^{-s/2}\,\Gamma(s/2) \quad\text{and}\quad \Gamma_{\mathbf{C}}(s) := (2\pi)^{-s}\,\Gamma(s). \qquad (6.33)$$


6.1. Definition. Let $K$ be a number field, $r_1$ (resp. $r_2$) the number of real embeddings (resp. of pairs of complex embeddings). Let $r = r_1 + r_2 - 1$, and choose $\epsilon_1, \ldots, \epsilon_r$ to be a basis for $O_K^*/\mu_K$. The regulator of units of $K$ is defined as the absolute value of any $r \times r$ determinant taken from the $r \times (r+1)$ matrix of coefficients $\log|\epsilon_i|_v$, for $1 \le i \le r$, where $v$ runs over the Archimedean places.


6.2. Theorem. (Hecke) Let $K$ be a number field containing $w_K$ roots of unity and with $r_1$ real embeddings, $r_2$ pairs of complex embeddings, discriminant $\Delta_K$, class number $h_K$ and regulator of units $R_K$. The function $\zeta_K(s)$, initially defined for $\mathrm{Re}(s) > 1$, can be analytically continued to the whole complex plane, except for a simple pole at $s = 1$ with residue:

$$\lim_{s \to 1}(s-1)\,\zeta_K(s) = \frac{2^{r_1}(2\pi)^{r_2} R_K h_K}{w_K\sqrt{|\Delta_K|}}. \qquad (6.34)$$

Furthermore, if we let

$$\xi_K(s) := |\Delta_K|^{s/2}\,\Gamma_{\mathbf{R}}(s)^{r_1}\,\Gamma_{\mathbf{C}}(s)^{r_2}\,\zeta_K(s),$$

then we can write the functional equation of $\zeta_K(s)$ in the form

$$\xi_K(s) = \xi_K(1-s). \qquad (6.35)$$

Finally, $\xi_K(s)$ is bounded in every vertical strip (outside of a neighborhood of 0 and 1).


We should also point out that $\zeta_K(s) \ne 0$ for $\mathrm{Re}(s) > 1$ (look at the Euler product), and hence $\xi_K(s) \ne 0$ as well. Because of the functional equation, we also have that $\xi_K(s) \ne 0$ for $\mathrm{Re}(s) < 0$. By observing that $\Gamma(s)$ has a simple pole at each non-positive integer, we can deduce that, in the half-plane $\mathrm{Re}(s) < 0$, the function $\zeta_K(s)$ only vanishes at negative integers, with order $r_2$ at odd negative integers and order $r_1 + r_2$ at even negative integers, the order at zero being $r_1 + r_2 - 1$.


Let $\chi$ be a primitive Dirichlet character modulo $N > 2$ (see Exercise 1-6.12). We set $\epsilon = 0$ (resp. $\epsilon = 1$) if $\chi(-1) = 1$ (resp. if $\chi(-1) = -1$), and

$$\Lambda(\chi, s) := N^{s/2}\,\Gamma_{\mathbf{R}}(s + \epsilon)\,L(\chi, s).$$

The function $L(\chi, s)$ can then be continued to an entire function and satisfies the functional equation (where $w_\chi$ is a complex number of absolute value 1):

$$\Lambda(\chi, s) = w_\chi\,\Lambda(\bar{\chi}, 1-s). \qquad (6.36)$$

Furthermore, $\Lambda(\chi, s)$ is bounded in every vertical strip.

We have seen in Chap. 4 that if $K = \mathbf{Q}(\sqrt{d})$, then there exists a character $\chi_d$ modulo $\Delta_K$ such that $\zeta_K(s) = \zeta(s)L(\chi_d, s)$. From this, we can deduce the following formulas.


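i) If $K$ is real quadratic and $\epsilon > 1$ is its fundamental unit, then

$$L(\chi_d, 1) = \frac{2 h_K \log\epsilon}{\sqrt{|\Delta_K|}}.$$

ii) If $K$ is imaginary quadratic (other than $\mathbf{Q}(i)$ and $\mathbf{Q}(j)$, for which $w_K = 4$ and 6, respectively), then

$$L(\chi_d, 1) = \frac{\pi h_K}{\sqrt{|\Delta_K|}}.$$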
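These two formulas can be tested numerically. The following is a minimal sketch, under assumptions which are not in the text: we restrict to discriminants $\pm p$ with $p$ an odd prime, for which the character $\chi_d$ coincides with the Legendre symbol modulo $p$, and we approximate $L(\chi_d, 1)$ by a partial sum over complete periods (the function names are hypothetical).

```python
import math


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1 or p - 1


def l_at_one(p, periods=20000):
    """Partial sum of sum chi(n)/n over `periods` complete periods of chi = (./p)."""
    table = [legendre(n, p) for n in range(p)]
    return sum(table[n % p] / n for n in range(1, p * periods + 1))


def class_number_imaginary(p):
    """Estimate h_K for K = Q(sqrt(-p)), p = 3 mod 4, p > 3 (so w_K = 2)."""
    return l_at_one(p) * math.sqrt(p) / math.pi


def class_number_real(p, fundamental_unit):
    """Estimate h_K for K = Q(sqrt(p)), p = 1 mod 4, given its fundamental unit."""
    return l_at_one(p) * math.sqrt(p) / (2 * math.log(fundamental_unit))


if __name__ == "__main__":
    print(class_number_imaginary(23))                  # close to an integer
    print(class_number_real(5, (1 + math.sqrt(5)) / 2))
```

The series converges slowly (the error after $M$ terms is of order $p/M$), so this is only suitable for small discriminants; the closed formulas of Exercise 4-6.6 are far more efficient.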


By combining these formulas with those proven for $L(\chi, 1)$ in Exercise 4-6.6, we can infer some interesting properties of $h_K$ and $\log\epsilon$. In the case where we take $K = \mathbf{Q}(\exp(2\pi i/\ell))$ (for an odd prime $\ell$), we have essentially proven, during the proof of Lemma 4-4.18, the formula

$$\zeta_K(s) = \zeta(s)\prod_{\chi} L(\chi, s),$$

where the product is taken over the nontrivial Dirichlet characters modulo $\ell$. We can, of course, also deduce the formula:

$$\frac{(2\pi)^{(\ell-1)/2} R_K h_K}{2\,\ell^{\ell/2}} = \prod_{j=1}^{\ell-2} L(\chi_j, 1),$$

where $\chi_1, \ldots, \chi_{\ell-2}$ denote these nontrivial characters.


The Artin L-functions associated to a representation p, which are defined in 
Appendix C, provide another example. They satisfy a functional equation 
of the same type, but we do not in general know whether the meromorphic 
continuation is in fact holomorphic. 


Modular forms. We now define some other Dirichlet series coming from a 
world apparently far away from the previous ones, namely the automorphic 
world. It gives us an opportunity to briefly introduce modular functions and 
curves to which we have already alluded. 


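6.3. Definition. The Poincaré half-plane is $\mathcal{H} := \{z \in \mathbf{C} \mid \mathrm{Im}(z) > 0\}$, and the extended Poincaré half-plane is $\mathcal{H}^* := \mathcal{H} \cup \mathbf{P}^1(\mathbf{Q})$.

The group $\mathrm{GL}_2^+(\mathbf{R})$ of $2 \times 2$ matrices with positive determinant, as well as the group $\mathrm{SL}_2(\mathbf{R})$ of matrices with determinant 1, acts on $\mathcal{H}$ by the action $\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}, z\right) \mapsto (az+b)/(cz+d)$. The groups $\mathrm{GL}_2^+(\mathbf{Q})$ and $\mathrm{SL}(2,\mathbf{Z})$ act on $\mathcal{H}^*$. The action of the latter group is discrete and we can therefore form the quotients $Y := \mathrm{SL}(2,\mathbf{Z})\backslash\mathcal{H}$ and $X := \mathrm{SL}(2,\mathbf{Z})\backslash\mathcal{H}^*$ and endow them with the structure of a Riemann surface (see [68]). In fact, $Y \cong \mathbf{A}^1(\mathbf{C})$ and $X \cong \mathbf{P}^1(\mathbf{C})$.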

The group SL(2, Z) and its finite index subgroups play an important role 
in arithmetical questions. In what follows, we will introduce a whole family 
of such subgroups. 


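6.4. Definition. A subgroup $\Gamma \subset \mathrm{SL}(2,\mathbf{Z})$ is a congruence subgroup if it contains $\Gamma(N)$ for a certain $N$ where

$$\Gamma(N) := \left\{A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z}) \;\middle|\; A \equiv I \bmod N\right\}.$$

We denote by $Y(N) := \Gamma(N)\backslash\mathcal{H}$ and $X(N) := \Gamma(N)\backslash\mathcal{H}^*$.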


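Besides $\Gamma(N)$ itself, two other congruence subgroups deserve to be mentioned.

i) The congruence group

$$\Gamma_1(N) := \left\{A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z}) \;\middle|\; A \equiv \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} \bmod N\right\},$$

where $Y_1(N) := \Gamma_1(N)\backslash\mathcal{H}$ and $X_1(N) := \Gamma_1(N)\backslash\mathcal{H}^*$;

ii) The congruence group

$$\Gamma_0(N) := \left\{A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z}) \;\middle|\; c \equiv 0 \bmod N\right\},$$

where $Y_0(N) := \Gamma_0(N)\backslash\mathcal{H}$ and $X_0(N) := \Gamma_0(N)\backslash\mathcal{H}^*$.

We can easily see that $\Gamma_1(N)$ is normal in $\Gamma_0(N)$ and that $\Gamma_0(N)/\Gamma_1(N) \cong (\mathbf{Z}/N\mathbf{Z})^*$ by the map $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto d \bmod N$.

It can be shown (see [27] or [68]) that $Y_0(N)$ (resp. $Y_1(N)$, $Y(N)$) are affine algebraic curves, whereas $X_0(N)$ (resp. $X_1(N)$, $X(N)$) are projective algebraic curves. Furthermore, $X_0(N)$ and $X_1(N)$ are defined over $\mathbf{Q}$, whereas $X(N)$ is defined over $\mathbf{Q}(\exp(2\pi i/N))$.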


6.5. Definition. Let $\Gamma$ be a congruence subgroup. A modular form of weight $k$ with respect to $\Gamma$ is a holomorphic function $f : \mathcal{H} \to \mathbf{C}$ such that the following properties hold.

i) For $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ and $z \in \mathcal{H}$, we have

$$f\left(\frac{az+b}{cz+d}\right) = (cz+d)^k f(z). \qquad (6.37)$$

ii) The function $f$ is holomorphic on $\mathcal{H}^*$, in other words, for every $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z})$, the limit of $f\left(\frac{az+b}{cz+d}\right)(cz+d)^{-k}$ as $\mathrm{Im}\,z$ tends to infinity exists. If this limit is always zero, $f$ is said to be a cusp form or parabolic.

We denote by $M_k(\Gamma)$ the vector space of these modular forms and by $S_k(\Gamma)$ the subspace of cusp forms. We will mainly be concerned with modular forms for $\Gamma_0(N)$, but we can introduce a variation of them, with the help of a Dirichlet character $\chi$ modulo $N$. A function $f$ is called a modular form for $\Gamma_0(N)$ twisted by $\chi$ if it is modular when condition (6.37) is replaced by

$$\forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N), \qquad f\left(\frac{az+b}{cz+d}\right) = \chi(d)(cz+d)^k f(z). \qquad (6.38)$$

We denote by $M_k(N,\chi)$ (resp. $S_k(N,\chi)$) the space of these forms (resp. cusp forms). It can be shown that $S_k(\Gamma_1(N)) = \bigoplus_\chi S_k(N,\chi)$.


6.6. Remarks. 1) Observe that, since $-\mathrm{Id} \in \Gamma_0(N)$, an element $f$ of $M_k(N,\chi)$ must satisfy $f(z) = \chi(-1)(-1)^k f(z)$. Thus $M_k(N,\chi) = \{0\}$, except maybe if $\chi(-1) = (-1)^k$.

2) Every congruence subgroup $\Gamma$ contains an element $T_h := \begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix}$ with $h$ non-zero and minimal: for example, $T_1 \in \Gamma_1(N)$. Therefore, every $f \in M_k(\Gamma)$ satisfies $f(z+h) = f(z)$, which allows us to write its Fourier series expansion as:

$$f(z) = \sum_{n \in \mathbf{Z}} a_n q_h^n, \quad\text{where } q_h := \exp\left(\frac{2\pi i z}{h}\right). \qquad (6.39)$$

Moreover, the condition of being holomorphic on $\mathcal{H}^*$ imposes that $a_n = 0$ for $n < 0$, whereas its vanishing at $\infty$ is written $a_n = 0$ for $n \le 0$ (n.b. this is a necessary condition, but for $f$ to be a form, holomorphy must be tested at all points in $\mathbf{P}^1(\mathbf{Q}) = \mathcal{H}^* \setminus \mathcal{H}$).

3) If $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(\mathbf{R})$, so that $\delta := ad - bc > 0$, and if we let $\gamma' = \delta^{-1/2}\gamma$, then $\gamma' \in \mathrm{SL}_2(\mathbf{R})$ and $\gamma' \cdot z = \gamma \cdot z$. Therefore, if $f$ is modular of weight $k$ for $\gamma'$, then

$$f\left(\frac{az+b}{cz+d}\right) = \delta^{-k/2}(cz+d)^k f(z). \qquad (6.40)$$


6.7. Definition. Let $f \in S_k(\Gamma_0(N))$, and let $f(z) = \sum_{n \ge 1} a_n \exp(2\pi i n z) = \sum_{n \ge 1} a_n q^n$ be its Fourier expansion. The Dirichlet series associated to $f$ is defined by:

$$L(f, s) := \sum_{n=1}^{\infty} a_n n^{-s}. \qquad (6.41)$$

We point out that we have the relation (called the “Mellin transform”):

$$\Gamma_{\mathbf{C}}(s)L(f, s) = (2\pi)^{-s}\,\Gamma(s)L(f, s) = \int_0^\infty f(it)\,t^{s-1}\,dt.$$


6.8. Definition. Let $f = \sum_n a_n(f)q^n \in M_k(\Gamma_0(N))$. Hecke operators are defined as follows.

1) If $p$ does not divide $N$, we define the operator $f \mapsto T_p f$ by:

$$a_n(T_p f) = a_{np}(f) + p^{k-1}a_{n/p}(f),$$

where, by convention, $a_{n/p} = 0$ if $p$ does not divide $n$.

2) If $p$ divides $N$, we define the operator $f \mapsto U_p f$ by:

$$a_n(U_p f) = a_{np}(f).$$

A small generalization which is often useful consists of defining the $T_p$ on all of $M_k(\Gamma_1(N))$. This can be done by setting, for $f \in M_k(N,\chi)$:

$$a_n(T_p f) = a_{np}(f) + \chi(p)p^{k-1}a_{n/p}(f).$$

Note that, since $\chi(p) = 0$ when $p$ divides $N$, we can consider the previous formula to also define $U_p$ when $p$ divides $N$.
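The action of $T_p$ on Fourier coefficients is easy to experiment with. The following sketch applies the formula above to truncated $q$-expansions; as a test case (an assumption, this example is not taken from the text) it uses the weight 12 cusp form $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$ of level 1, which is a Hecke eigenform with eigenvalue $a_p(\Delta)$. The function names are hypothetical.

```python
def delta_coefficients(bound):
    """Coefficients a_0, ..., a_bound of Delta = q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * (bound + 1)
    series[0] = 1
    for n in range(1, bound + 1):
        for _ in range(24):
            # Multiply the truncated series by (1 - q^n), in place.
            for k in range(bound, n - 1, -1):
                series[k] -= series[k - n]
    return [0] + series[:bound]  # multiplication by q shifts the indices


def hecke_T(p, k, a, chi_p=1):
    """Coefficients of T_p f: a_n(T_p f) = a_{np} + chi(p) p^(k-1) a_{n/p}.

    `a` lists a_0, ..., a_B; only the indices n with n*p <= B can be computed.
    Taking chi_p = 0 gives the operator U_p.
    """
    top = (len(a) - 1) // p
    out = []
    for n in range(top + 1):
        term = a[n * p]
        if n % p == 0:  # convention: a_{n/p} = 0 when p does not divide n
            term += chi_p * p ** (k - 1) * a[n // p]
        out.append(term)
    return out


if __name__ == "__main__":
    a = delta_coefficients(40)
    print(a[1:6])
    print(hecke_T(2, 12, a)[:6])  # should be a_2 times the coefficients of Delta
```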


6.9. Theorem. (Hecke, see [68]) Hecke operators commute with each other. If $f = \sum_n a_n(f)q^n \in S_k(\Gamma_0(N))$ is an eigenvector for each of the Hecke operators, i.e., $T_p f = \lambda_p f$ and $U_p f = \lambda_p f$, then $a_p(f) = \lambda_p a_1(f)$, and if we normalize $f$ by the condition $a_1(f) = 1$, the function $L(s, f)$ can be factored as an Euler product in the form:

$$L(s, f) = \prod_{p \mid N}\left(1 - a_p(f)p^{-s}\right)^{-1}\prod_{p \nmid N}\left(1 - a_p(f)p^{-s} + p^{k-1-2s}\right)^{-1}. \qquad (6.42)$$

Therefore, we see the appearance of an Euler product which, for $k = 2$, much resembles the L-function associated to an elliptic curve. To underline this resemblance, we will see how, under certain conditions, $L(s, f)$ satisfies a functional equation.


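Observe that the matrix $W_N := \begin{pmatrix} 0 & 1 \\ -N & 0 \end{pmatrix}$, which is not in $\mathrm{SL}(2,\mathbf{Z})$ but in $\mathrm{GL}_2^+(\mathbf{Q})$, nevertheless normalizes the subgroup $\Gamma_0(N)$ because

$$W_N\begin{pmatrix} a & b \\ c & d \end{pmatrix}W_N^{-1} = \begin{pmatrix} d & -c/N \\ -bN & a \end{pmatrix}.$$

We can deduce from this that $W_N$ acts on $M_k(\Gamma_0(N))$ (resp. $S_k(\Gamma_0(N))$), and since $W_N^2 = -N\,\mathrm{Id}$, we see that the spaces $M_k(\Gamma_0(N))$ (resp. $S_k(\Gamma_0(N))$) can be decomposed into the sum of two eigenspaces in which: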


$$f\left(-\frac{1}{Nz}\right) = \pm N^{k/2} z^k f(z). \qquad (6.43)$$


This remark can be used as a motivation for the following assertion. 


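6.10. Theorem. (Hecke) Let $\epsilon = \pm 1$, and let $f(\tau) = \sum_{n \ge 1} a_n \exp(2\pi i n\tau)$ be a modular cusp form for $\Gamma_0(N)$ of weight $k$ such that

$$f\left(-\frac{1}{Nz}\right) = \epsilon N^{k/2} z^k f(z). \qquad (6.44)$$

Let $\Lambda(s, f) := N^{s/2}(2\pi)^{-s}\,\Gamma(s)L(s, f)$ where $L(s, f) := \sum_{n \ge 1} a_n n^{-s}$. Then the function $\Lambda(s, f)$ can be analytically continued to the complex plane and satisfies the functional equation

$$\Lambda(s, f) = i^k\epsilon\,\Lambda(k - s, f). \qquad (6.45)$$

Furthermore, $\Lambda(s, f)$ is bounded in every vertical strip.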


Proof. We first note that for $\tau = it$ (where $t \in \mathbf{R}_+$), (6.44) is written

$$f\left(\frac{i}{Nt}\right) = \epsilon\, i^k N^{k/2} t^k f(it).$$

We can therefore perform the following computation by using the change of variables $t \mapsto 1/(Nt)$:

$$\begin{aligned}
\Lambda(s, f) &= N^{s/2}\int_0^\infty f(it)\,t^{s-1}\,dt \\
&= N^{s/2}\int_0^{1/\sqrt{N}} f(it)\,t^{s-1}\,dt + N^{s/2}\int_{1/\sqrt{N}}^\infty f(it)\,t^{s-1}\,dt \\
&= i^k\epsilon\,N^{(k-s)/2}\int_{1/\sqrt{N}}^\infty f(it)\,t^{k-s-1}\,dt + N^{s/2}\int_{1/\sqrt{N}}^\infty f(it)\,t^{s-1}\,dt.
\end{aligned}$$

We can easily see that the latter expression defines an entire function, because by showing first that $|a_n| = O(n^c)$ for some constant $c$, we can see that $|f(it)| = O(\exp(-2\pi t))$ as $t$ tends to infinity. Furthermore, the expression is clearly $(i^k\epsilon)$-symmetric when we change $s$ to $k - s$. The last assertion of the theorem is clear from the expression for $\Lambda(s, f)$ as an integral.


We will end this introduction to modular forms by indicating several con- 
nections between modular forms and Galois representations. 


First of all, we know how to associate, thanks to Deligne and Deligne-Serre 
(see [26]), Galois representations to modular forms. 


6.11. Theorem. (Deligne) Let $\ell$ be a prime number and $f = \sum_n a_n q^n \in M_k(N,\chi)$ a modular Hecke eigenform which is normalized (i.e., $a_1 = 1$). We know that the field $K$ generated by the values of $\chi$ and the $a_n$ is a number field. Then there exists an $\ell$-adic representation (with coefficients in a completion $K_\lambda$, an extension of $\mathbf{Q}_\ell$),

$$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(K_\lambda),$$

which satisfies the following properties.

i) $\rho$ is unramified outside of $N\ell$.
ii) For $p$ not dividing $N\ell$, we have the formulas

$$\mathrm{Tr}\,\rho(\mathrm{Frob}_p) = a_p \quad\text{and}\quad \det\rho(\mathrm{Frob}_p) = \chi(p)p^{k-1}. \qquad (6.46)$$

Moreover, if $f \in S_k(N,\chi)$, then the representation is irreducible.

We can, of course, deduce from this some representations $\bar{\rho}$ modulo $\ell$ by composing with $\mathcal{O}_\lambda \to \mathbf{F}_{\ell^s}$. In the fairly special case of forms of weight 1 (i.e. $k = 1$), Deligne and Serre proved that we can lift these representations to characteristic zero and thus obtain Artin representations. Recall also that if $f \ne 0$ is in $M_1(N,\chi)$, then $\chi(-1) = -1$.


6.12. Theorem. (Deligne-Serre) Let $f = \sum_n a_n q^n \in S_1(N,\chi)$ be a modular Hecke eigenform which is normalized (i.e., $a_1 = 1$). Then there exists an Artin representation,

$$\rho_f : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{C}),$$

which satisfies the following properties.

i) $\rho_f$ is unramified outside of $N$.
ii) For $p$ not dividing $N$, we have the formulas

$$\mathrm{Tr}\,\rho_f(\mathrm{Frob}_p) = a_p \quad\text{and}\quad \det\rho_f(\mathrm{Frob}_p) = \chi(p). \qquad (6.47)$$

Furthermore, the representation $\rho_f$ is irreducible and odd (i.e., if $c$ denotes complex conjugation, then $\det\rho_f(c) = -1$).

Let us point out that the representation $\rho_f$ is continuous; this would not, of course, be the case if we had simply embedded $K_\lambda$ into $\mathbf{C}$ and thus obtained a representation $G_{\mathbf{Q}} \to \mathrm{GL}_2(K_\lambda) \to \mathrm{GL}_2(\mathbf{C})$.


6.13. Conjecture. Every Artin representation of dimension 2 which is 
irreducible and odd is associated to a modular form of weight 1. 


L-functions associated to algebraic varieties. 


We will now return to the Hasse-Weil zeta function associated to an algebraic variety $V$ of dimension $r$, which we will assume, for the sake of simplicity, to be smooth, projective and defined over $\mathbf{Q}$. We know (see Proposition B-1.22) that for $p$ outside of a finite set $S$, the reduction modulo $p$ of $V$ remains smooth; we will denote it by $V_p$, and it is a projective variety defined over $\mathbf{F}_p$. We then have a natural definition (see the first section of this chapter) for the zeta function of $V/\mathbf{Q}$ by omitting the Euler factors for $p \in S$ (see footnote 5):

$$\zeta_S(V/\mathbf{Q}, s) = \prod_{p \notin S} Z(V_p/\mathbf{F}_p, p^{-s}) = \prod_{j=0}^{2r}\prod_{p \notin S} P_j(V_p/\mathbf{F}_p, p^{-s})^{(-1)^{j+1}}. \qquad (6.48)$$

This suggests that we let

$$L_{j,S}(V/\mathbf{Q}, s) = \prod_{p \notin S} P_j(V_p/\mathbf{F}_p, p^{-s})^{-1} = \prod_{p \notin S}\prod_{i}\left(1 - \alpha_{p,i}\,p^{-s}\right)^{-1}. \qquad (6.49)$$

Then we have $\zeta_S(V/\mathbf{Q}, s) = \prod_{j=0}^{2r} L_{j,S}(V/\mathbf{Q}, s)^{(-1)^j}$. Since $|\alpha_{p,i}| = p^{j/2}$, the Euler product is convergent for $\mathrm{Re}(s) > 1 + j/2$.

It is always true that $L_0(V, s) = \zeta(s)$ and $L_{2r}(V, s) = \zeta(s - r)$ (up to the omitted Euler factors at $p \in S$), since $P_0(V_p, T) = 1 - T$ and $P_{2r}(V_p, T) = 1 - p^r T$. By using relation (6.14), we see that

$$L_{2r-j}(V/\mathbf{Q}, s) = L_j(V/\mathbf{Q}, s - r + j).$$


The zeta function of a curve $C/\mathbf{Q}$ is written

$$\zeta(C/\mathbf{Q}, s) = \frac{\zeta(s)\,\zeta(s-1)}{L_1(C/\mathbf{Q}, s)}. \qquad (6.50)$$

Therefore, we see that what we called the “L-function” of an elliptic curve $E/\mathbf{Q}$ in Chap. 5 is denoted here by $L_1(E/\mathbf{Q}, s)$.

5. There exists a more sophisticated procedure than the theory introduced here for defining the local factors for all $p$; see for example [65].


We will reformulate Wiles's theorem (Theorem 5-7.6, the Shimura-Taniyama-Weil conjecture) in two (nontrivially) equivalent forms.

6.14. Theorem. Let $E/\mathbf{Q}$ be an elliptic curve with conductor $N = N_E$.
1) There exists a modular cusp form $f \in S_2(\Gamma_0(N))$ such that

$$L(E, s) = L(f, s). \qquad (6.51)$$

2) There exists a non-constant morphism $\varphi_E : X_0(N) \to E$.


Commentary. Wiles actually proved this result with some supplementary hypotheses, which were subsequently shown to be unnecessary. Some extraordinary features of this result deserve to be pointed out. The function $L(E, s)$ is constructed starting with local information (it actually suffices to know $\mathrm{card}\,E(\mathbf{F}_p)$), and the theorem indicates that the resulting L-function comes from a global object, a modular form, which determines its characteristics. The link between these two objects, the elliptic curve defined over $\mathbf{Q}$ and the modular form for $\Gamma_0(N)$, is achieved through the Galois representations associated to each of these objects (see Appendix C). The existence of such a link is actually suggested by the associated L-series and their functional equations (proved or conjectured). This program has been vastly generalized and is today called the Langlands program. Without being able to explain the details (see for example [20] and [82] for an introduction and references), we will only say that Langlands theory associates to each irreducible automorphic representation $\pi$ a function $L(\pi, s)$ defined by an Euler product and which has a functional equation (relating $L(\pi, s)$ and $L(\tilde{\pi}, 1-s)$). These representations are obtained as factors of the space $L^2(Z_{\mathbf{A}_{\mathbf{Q}}}\mathrm{GL}_n(\mathbf{Q})\backslash\mathrm{GL}_n(\mathbf{A}_{\mathbf{Q}}))$ (here, $\mathbf{A}_{\mathbf{Q}}$ denotes the ring of adeles, $Z$ the center of $\mathrm{GL}_n$ and $Z_{\mathbf{A}_{\mathbf{Q}}}$ the points of the center with values in the adeles) and have infinite dimension (see for example [20] or [32]). Langlands conjectures, for example, that every Artin L-function associated to a representation of dimension $n$ coincides with the $L(\pi, s)$ function associated to a representation of $\mathrm{GL}_n(\mathbf{A}_{\mathbf{Q}})$. In this context, modular forms are associated to representations of dimension 2 (the group $\mathrm{GL}_2$).


This leads us to describe what we should expect from a “nice” zeta function.


Expected properties of zeta or L functions. 


i) They are defined by a Dirichlet series in a half-plane $\mathrm{Re}(s) > a$:

$$L(M, s) = \sum_{n=1}^{\infty} a_n n^{-s}.$$

ii) They are written, in a half-plane $\mathrm{Re}(s) > b$, as an Euler product:

$$L(M, s) = \prod_p L_p(M, s) = \prod_p\left(1 + a_{p,1}p^{-s} + \cdots + a_{p,d}\,p^{-ds}\right)^{-1}.$$

We call $d$ the degree of the Euler product. Moreover, we require in general that $1 + a_{p,1}T + \cdots + a_{p,d}T^d = \prod_{j=1}^{d}(1 - \alpha_{p,j}T)$ where, for almost every $p$, we have the equality $|\alpha_{p,j}| = p^{w/2}$ and, in particular, $a_{p,d} \ne 0$. The integer $w$ is called the weight of $M$.

iii) The function $L(M, s)$ can be analytically continued to the complex plane, except for a finite number of poles. It satisfies a functional equation of the type $\Lambda(M, s) = w_M\,\Lambda(\check{M}, w + 1 - s)$, where $w_M$ is a complex number with absolute value 1,

$$\Lambda(M, s) = A^{s/2}\prod_{j}\Gamma_{\mathbf{R}}(s + t_j)^{h_j}\,\Gamma_{\mathbf{C}}(s + t'_j)^{h'_j}\,L(M, s),$$

$L(\check{M}, s)$ is a function of the same type and $A, t_j, t'_j, h_j, h'_j$ are some constants.

iv) Outside of a neighborhood of its possible poles, the function $\Lambda(M, s)$ is bounded in every vertical strip $\sigma_1 \le \mathrm{Re}(s) \le \sigma_2$.


6.15. Remark. With some optimism, we could add as a property the analogue of the Riemann hypothesis (abbreviated GRH):

“The zeros of $\Lambda(M, s)$ are situated on the line $\mathrm{Re}(s) = (w+1)/2$.” (GRH)

We should point out that the given hypotheses imply that the Euler product which defines $L(M, s)$ is absolutely convergent for $\mathrm{Re}(s) > 1 + w/2$ and non-zero in this half-plane. Therefore, $\Lambda(M, s)$ does not vanish for $\mathrm{Re}(s) > 1 + w/2$ and, because of the functional equation, for $\mathrm{Re}(s) < w/2$. Just like the function $\zeta(s)$, the function $L(M, s)$ has “trivial” zeros in the half-plane $\mathrm{Re}(s) < w/2$, these being governed by the Gamma factors in the functional equation. Thus, the generalized Riemann hypothesis describes the location of the zeros in the “critical strip” $w/2 \le \mathrm{Re}(s) \le 1 + w/2$.


6.16. Conjecture. (Hasse-Weil, see [65]) Let $V/\mathbf{Q}$ be a smooth projective variety. To every $0 \le j \le 2\dim V$ we can associate an integer $A_j$, local factors $L_{p,j}(V, s)$ at places $p \in S$ of bad reduction and a Gamma factor $L_{\infty,j}(V, s)$ such that the product

$$\Lambda_j(V, s) = A_j^{s/2}\,L_{\infty,j}(V, s)\left(\prod_{p \in S} L_{p,j}(V, s)\right)L_{j,S}(V/\mathbf{Q}, s)$$

satisfies the previous properties and, in particular, the functional equation

$$\Lambda_j(V/\mathbf{Q}, s) = \pm\Lambda_j(V/\mathbf{Q}, j + 1 - s). \qquad (6.52)$$

This conjecture has only been proven in a few cases.


Appendix A 


Factorization 


“‘Four thousand two hundred and seven, that’s the exact number,’ the King said, 
referring to his book.” 


LEWIS CARROLL (THROUGH THE LOOKING-GLASS)


In this chapter we take another look at the factorization problem that we started in Chap. 2 by explicitly describing a method for factoring polynomials and by sketching two of the most powerful algorithms developed during the last two decades for factoring integers: an algorithm due to Lenstra which uses elliptic curves [49] and the number field sieve algorithm originally due to Pollard (see [19]). For those who are put off by probabilistic or heuristic estimation methods, keep in mind that once a factorization is found, it is very quick and easy to check it. It would be appropriate to complete this introduction by citing [22], the reference for algorithmic number theory. Furthermore, most of these algorithms are already implemented and available, for example with the PARI/GP package.


1. Polynomial Factorization 


We begin with the observation that it is fairly easy to find polynomial roots 
whose multiplicity is greater than 1: we just need to compute D(X) := 
gcd(P(X), P’(X)). We can therefore essentially concentrate on factoring 
polynomials without any multiple roots. 


There are many factorization algorithms in $\mathbf{F}_p[X]$ (or even $\mathbf{F}_q[X]$). We present one of them, due to Berlekamp, which is very efficient as long as $p$ is not too large. It is based on the following two lemmas.


1.1. Lemma. Let $f(X) \in \mathbf{F}_q[X]$ be a square-free polynomial of degree $n$.


M. Hindry, Arithmetics, Universitext, 259 
DOI 10.1007/978-1-4471-2131-2, 
© Springer-Verlag London Limited 2011

We define the $n \times n$ matrix $B = (b_{i,j})_{0 \le i,j \le n-1}$ by the congruences

$$b_{0,j} + b_{1,j}X + \cdots + b_{n-1,j}X^{n-1} \equiv X^{qj} \bmod f(X).$$

Then for any polynomial $h(X) = h_0 + h_1 X + \cdots + h_{n-1}X^{n-1}$, the following are equivalent:

$$h(X)^q - h(X) \equiv 0 \bmod f(X) \iff (B - I)\begin{pmatrix} h_0 \\ \vdots \\ h_{n-1} \end{pmatrix} = 0.$$

Furthermore, the dimension of $\mathrm{Ker}(B - I)$ is the number of irreducible factors of $f$.


Proof. The map from $\mathbf{F}_q[X]/(f(X))$ onto itself given by $P \mapsto P^q$ is linear and $B$ is by definition the matrix of this linear transformation in the basis $1, \ldots, X^{n-1}$. The first statement follows directly from this observation. For the second, write $f$ as the product $f = f_1 \cdots f_r$ of its irreducible factors. Then

$$\mathbf{F}_q[X]/(f(X)) \cong \mathbf{F}_q[X]/(f_1(X)) \times \cdots \times \mathbf{F}_q[X]/(f_r(X)).$$

The equation $h(X)^q - h(X) \equiv 0 \bmod f(X)$ translates to $h(X)^q - h(X) \equiv 0 \bmod f_i(X)$, for $i = 1, \ldots, r$. Since $\mathbf{F}_q[X]/(f_i(X))$ is a finite extension of $\mathbf{F}_q$, this is equivalent to $h(X) \equiv \lambda_i \bmod f_i(X)$ with $\lambda_i \in \mathbf{F}_q$. We therefore have a total of $q^r$ solutions, which proves that the vector space of solutions has dimension $r$.


1.2. Lemma. Let $f(X) \in \mathbf{F}_q[X]$ be a polynomial of degree $n$ and let $h(X)$ be a polynomial of degree $\le n - 1$ such that $h(X)^q - h(X) \equiv 0 \bmod f(X)$. Then we have the following factorization:

$$f(X) = \prod_{c \in \mathbf{F}_q}\gcd(f(X), h(X) - c).$$

Proof. The product on the right hand side clearly divides $f(X)$. Conversely, since $\prod_{c \in \mathbf{F}_q}(X - c) = X^q - X$, we see that $f(X)$ divides $h(X)^q - h(X) = \prod_{c \in \mathbf{F}_q}(h(X) - c)$.
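Lemmas A-1.1 and A-1.2 combine into a complete factorization procedure. The following is a minimal sketch, under assumptions which are not in the text: it works over a prime field $\mathbf{F}_p$ only (not a general $\mathbf{F}_q$), it expects $f$ monic and square-free, polynomials are lists of coefficients starting with the constant term, and all function names are hypothetical.

```python
def trim(a):
    """Drop zero leading coefficients (lists start with the constant term)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, b, p):
    """Quotient and remainder of a by b in F_p[X]; b is reduced, trimmed, nonzero."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        trim(a)
    return q, a


def poly_gcd(a, b, p):
    """Monic gcd in F_p[X] by the Euclidean algorithm."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    inv = pow(a[-1], -1, p) if a else 0
    return [c * inv % p for c in a]


def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f in F_p[X]."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(prod, f, p)[1]


def poly_powmod(base, e, f, p):
    """base^e modulo f by repeated squaring."""
    result = [1]
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def kernel_mod_p(M, p):
    """Basis of {v : M v = 0} over F_p, by Gauss-Jordan elimination."""
    n, M, pivots, r = len(M), [row[:] for row in M], {}, 0
    for col in range(n):
        piv = next((i for i in range(r, n) if M[i][col]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][col], -1, p)
        M[r] = [x * inv % p for x in M[r]]
        for i in range(n):
            if i != r and M[i][col]:
                c = M[i][col]
                M[i] = [(x - c * y) % p for x, y in zip(M[i], M[r])]
        pivots[col], r = r, r + 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [0] * n
        v[free] = 1
        for col, row in pivots.items():
            v[col] = -M[row][free] % p
        basis.append(v)
    return basis


def berlekamp(f, p):
    """Irreducible factors of a monic square-free f in F_p[X], p prime."""
    n = len(f) - 1
    if n <= 1:
        return [f]
    xp = poly_powmod([0, 1], p, f, p)           # X^p mod f
    cols, cur = [], [1]
    for _ in range(n):                          # column j holds X^(pj) mod f
        cols.append(cur + [0] * (n - len(cur)))
        cur = poly_mulmod(cur, xp, f, p)
    M = [[(cols[j][i] - (i == j)) % p for j in range(n)] for i in range(n)]
    basis = kernel_mod_p(M, p)                  # Ker(B - I), Lemma A-1.1
    factors = [f]
    for h in basis:
        if len(factors) == len(basis):          # all r factors found
            break
        split = []
        for g in factors:                       # Lemma A-1.2 applied to g
            for c in range(p):
                d = poly_gcd(g, [(h[0] - c) % p] + h[1:], p)
                if len(d) > 1:
                    split.append(d)
        factors = split
    return factors
```

For instance, `berlekamp([1, 0, 0, 0, 1], 5)` factors $X^4 + 1$ over $\mathbf{F}_5$. The loop over all $c \in \mathbf{F}_p$ is what makes the method practical only when $p$ is not too large.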


Let us now summarize the steps of a factorization algorithm for polynomials in $\mathbf{F}_q[X]$. By calculating $\gcd(f(X), f'(X))$ and factoring it out, we are back to the case where $f$ is square-free. By applying the division algorithm to $X^{qj}$ divided by $f(X)$, we construct the matrix $B$ from Lemma A-1.1, and we calculate a solution to the linear system which gives a polynomial $h(X)$. Finally, we successively calculate $\gcd(f(X), h(X) - c)$ for $c \in \mathbf{F}_q$ until we find a nontrivial factor.

Even though we now see that there exists an algorithm for factoring polynomials in $\mathbf{F}_q[X]$, the situation is less clear in $\mathbf{Z}[X]$. It is nevertheless a priori possible to bound the size of a factor of a given polynomial. One such bound is given by the Gel'fond inequality or by the following lemma.


1.3. Lemma. Let $P(X) = p_0 + p_1 X + \cdots + p_d X^d$ and $Q(X) = q_0 + q_1 X + \cdots + q_e X^e$ be two polynomials with integer coefficients. If $Q$ divides $P$, then

$$\sum_{j=0}^{e}|q_j| \le 2^e\left(\sum_{i=0}^{d} p_i^2\right)^{1/2}. \qquad (A.1)$$


Proof. We define the Mahler measure of a polynomial $P = a_0(X - \alpha_1)\cdots(X - \alpha_d)$ to be

$$M(P) := \exp\left(\int_0^1 \log|P(e^{2\pi i t})|\,dt\right) = |a_0|\prod_{i=1}^{d}\max(1, |\alpha_i|),$$

where the second equality is equivalent to the formula $\int_0^1 \log|e^{2\pi i t} - \alpha|\,dt = \log\max(1, |\alpha|)$, which is well-known and is a particular case of Jensen's formula. We clearly have $M(PQ) = M(P)M(Q)$, and for all nonzero polynomials with integer coefficients, $M(P) \ge 1$. By using the relation between coefficients and roots, we can show that the coefficient $a_k$ of $X^{d-k}$ in $P$ satisfies

$$|a_k| = |a_0|\left|\sum_{j_1 < \cdots < j_k}\alpha_{j_1}\cdots\alpha_{j_k}\right| \le \binom{d}{k}M(P),$$

so that the sum of the absolute values of the coefficients of $P$ is at most $2^d M(P)$. Jensen's convexity inequality gives

$$M(P) = \exp\left(\frac{1}{2}\int_0^1 \log|P(e^{2\pi i t})|^2\,dt\right) \qquad (A.2)$$

$$\le \left(\int_0^1 |P(e^{2\pi i t})|^2\,dt\right)^{1/2} = \left(\sum_{i=0}^{d} p_i^2\right)^{1/2}. \qquad (A.3)$$

Now suppose that we have a factorization $P = QR$. Then we have

$$M(Q) \le M(Q)M(R) = M(P),$$

and can therefore conclude that

$$\sum_{j=0}^{e}|q_j| \le 2^e M(Q) \le 2^e M(P) \le 2^e\left(\sum_{i=0}^{d} p_i^2\right)^{1/2}.$$

To make things simpler, we will assume that $f(X) \in \mathbf{Z}[X]$ is monic. From a theoretical point of view, we may make this assumption without loss of generality because if $f(X) = aX^d + \cdots \in \mathbf{Z}[X]$, then the polynomial $\tilde{f}(X) := a^{d-1}f(X/a) = X^d + \cdots$ will be monic with integer coefficients, and a factorization of $\tilde{f}$ will give a factorization of $f$. In order to find a factorization $f(X) = f_1(X)f_2(X)$, let us start with the trivial remark that any such factorization remains a factorization modulo $N$, for every integer $N$. We can therefore start by factoring $f(X)$ in $\mathbf{F}_p[X]$ for some primes $p$ and compare the degrees of the factors. If by any chance we find that some degrees are incompatible, we can use this to prove that $f(X)$ is irreducible. In general, we proceed by making use of the following variation of Hensel's lemma.


1.4. Lemma. Let $f(X) \in \mathbf{Z}[X]$ be monic and $p$ prime. Assume that $f(X) \equiv f_1(X)g_1(X) \bmod p$, with $\gcd(\bar{f}_1, \bar{g}_1) = 1$ in $\mathbf{F}_p[X]$. Then for all $m \ge 1$, there exist two monic polynomials $f_m, g_m \in \mathbf{Z}[X]$ such that $f_m \equiv f_1 \bmod p$, $g_m \equiv g_1 \bmod p$ and

$$f(X) \equiv f_m(X)g_m(X) \bmod p^m.$$

Proof. The hypothesis that the reductions modulo $p$ of $f_1$ and $g_1$ are relatively prime is essential and is used in the assertion that there exist $U, V \in \mathbf{Z}[X]$ such that $Uf_1 + Vg_1 \equiv 1 \bmod p$. Let us suppose that we have constructed the polynomials $f_m$ and $g_m$ so that $f = f_m g_m + p^m C$. In looking for polynomials in step $m + 1$ of the form $f_{m+1} = f_m + p^m A$, $g_{m+1} = g_m + p^m B$, we find that $f - f_{m+1}g_{m+1} = p^m(C - Bf_m - g_m A) - p^{2m}AB \equiv p^m(C - Bf_1 - Ag_1) \bmod p^{m+1}$. It follows from the initial remark that there exist $A, B \in \mathbf{Z}[X]$ such that $Bf_1 + Ag_1 \equiv C \bmod p$; since $\deg C < \deg f$, they can be chosen with $\deg A < \deg f_1$ and $\deg B < \deg g_1$, so that $f_{m+1}$ and $g_{m+1}$ remain monic.
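A small worked instance of one lifting step may help. The example below is not from the text; it uses $f = X^2 + 1$ and $p = 5$, chosen only because the numbers stay small, and follows the notation $A$, $B$, $C$ of the proof.

```latex
% Start: f = X^2 + 1 \equiv (X - 2)(X + 2) \bmod 5, with f_1 = X - 2, g_1 = X + 2.
f - f_1 g_1 = (X^2 + 1) - (X^2 - 4) = 5, \qquad\text{so } C = 1 .
% Solve B f_1 + A g_1 \equiv C \bmod 5 with constants A, B:
B(X - 2) + A(X + 2) = (A + B)X + 2(A - B) \equiv 1 \bmod 5
\;\Longrightarrow\; B \equiv -A, \quad 4A \equiv 1, \quad A \equiv 4, \quad B \equiv 1 .
% Lifted factors modulo 25:
f_2 = f_1 + 5A = X + 18, \qquad g_2 = g_1 + 5B = X + 7 ,
f_2\, g_2 = X^2 + 25X + 126 \equiv X^2 + 1 \bmod 25 .
```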


The factorization algorithm can be described as follows: a bound $B$ for the size of the coefficients of a possible factor is computed according to Lemma A-1.3. Then $f$ is reduced modulo a prime $p$, and the result is factored using the preceding algorithm (if a multiple root is found, then reduce modulo another prime). The factorization modulo $p$ is lifted to a factorization modulo $p^m$ where $m$ is chosen so that $p^m > B$. After that, we check to see if that factorization comes from a factorization over $\mathbf{Z}$.
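The first and third steps (the bound $B$ and the choice of $m$) are simple to make explicit. The following sketch computes the right-hand side of (A.1), rounded up to an integer, and the smallest $m$ with $p^m > B$; the function names are hypothetical and the rounding is a choice made here, not something prescribed by the text.

```python
import math


def factor_coefficient_bound(p_coeffs, e):
    """Integer upper bound for 2^e * sqrt(sum p_i^2), the right side of (A.1).

    `p_coeffs` lists the integer coefficients of P, and e is the degree of
    the putative factor Q.
    """
    norm_sq = sum(c * c for c in p_coeffs)
    if norm_sq == 0:
        return 0
    ceil_sqrt = math.isqrt(norm_sq - 1) + 1  # exact ceiling of sqrt(norm_sq)
    return 2 ** e * ceil_sqrt


def lifting_exponent(p, bound):
    """Smallest m >= 1 such that p^m > bound."""
    m, power = 1, p
    while power <= bound:
        power *= p
        m += 1
    return m


if __name__ == "__main__":
    # P = X^4 - 1, looking for factors of degree at most 2
    B = factor_coefficient_bound([-1, 0, 0, 0, 1], 2)
    print(B, lifting_exponent(5, B))
```

In practice one usually lifts a little further than this (for example until $p^m > 2B$) so that signed coefficients can be read off unambiguously; see [22] for the details.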


You can refer to Cohen’s book [22] for a more detailed discussion of this 
algorithm and its variations. 


2. Factorization and Elliptic Curves 


Let us start by presenting an algorithm which is relatively inefficient (in certain cases), but which lends itself well to being generalized: Pollard's “p − 1” method.

2.1. Definition. Let $Y$ be an integer. A number is said to be $Y$-smooth if all of its prime divisors are smaller than $Y$. A number is said to be $Y$-powersmooth if every prime power which divides it is smaller than $Y$.


Let $N$ be a number which we would like to factor and $p$ a prime divisor of $N$. Suppose that $p - 1$ is $Y$-powersmooth for some $Y$ which is not too large. Then $p - 1$ divides $m(Y) := \mathrm{lcm}(2, 3, \ldots, Y)$. If $a$ is an integer which is relatively prime to $N$, we would have $a^{m(Y)} - 1 \equiv 0 \bmod p$, and therefore

$$\gcd\left(a^{m(Y)} - 1, N\right) \ne 1,$$

which would very likely produce a non-trivial factor of $N$.
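The method just described fits in a few lines. The following is a minimal sketch; the function names and the default base $a = 2$ are choices made here for illustration, not part of the text.

```python
import math


def lcm_up_to(Y):
    """m(Y) = lcm(2, 3, ..., Y)."""
    m = 1
    for k in range(2, Y + 1):
        m = m * k // math.gcd(m, k)
    return m


def pollard_p_minus_1(N, Y, a=2):
    """Return a non-trivial factor of N found with bound Y, or None."""
    g = math.gcd(a, N)
    if 1 < g < N:
        return g  # lucky: a already shares a factor with N
    g = math.gcd(pow(a, lcm_up_to(Y), N) - 1, N)
    return g if 1 < g < N else None


if __name__ == "__main__":
    # 1403 = 23 * 61, with 61 - 1 = 60 = lcm(2, ..., 5)
    print(pollard_p_minus_1(1403, 5))
    print(pollard_p_minus_1(1403, 11))
```

With $N = 1403 = 23 \cdot 61$ and $Y = 5$, the exponent $m(Y) = 60$ is a multiple of $61 - 1$ but not of the order of 2 modulo 23, so the gcd isolates the factor 61. With $Y = 11$ both $p - 1$ divide $m(Y)$, the gcd equals $N$, and the attempt fails: this is the case excluded by “very likely” in the text.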


The method would therefore be efficient if $N$ had a prime factor $p$ such that $p - 1$ is $Y$-smooth (or $Y$-powersmooth) for some $Y$ which is not too large. The problem with this method is that large prime numbers $p$ where $p - 1$ is $Y$-smooth are fairly rare (see Proposition A-2.3 for an estimate of how rare). Likewise, one could hope to find an $a$ whose order is significantly smaller than $p - 1$. It is not difficult to see that this case is likewise fairly rare. The key idea of Lenstra's algorithm is to observe that we are actually working in $\mathbf{F}_p^*$, which is cyclic of cardinality $p - 1$. If we can apply the same type of reasoning to other groups of varying cardinality, we will have a better chance of factoring $N$. This is precisely what elliptic curve theory gives us.

If $E$ is a curve over $\mathbf{F}_p$ and $P \in E(\mathbf{F}_p)$, its order $n_P$ divides $\mathrm{card}\,E(\mathbf{F}_p) \in [p + 1 - 2\sqrt{p},\, p + 1 + 2\sqrt{p}]$. If $\mathrm{card}\,E(\mathbf{F}_p)$ is $Y$-powersmooth, we analogously have

$$[m(Y)](P) = 0 \quad\text{in } E(\mathbf{F}_p).$$


Now we have the advantage of being able to try multiple elliptic curves. 
We only need to find sufficiently many curves such that the orders of their 
groups of rational points are Y-smooth. Let us see how this procedure, 
once properly formulated, gives a factorization algorithm. 


We first need to discuss points in the projective plane or on an elliptic curve 
over Z/NZ. Let us point out that we do not lose anything by assuming 
that the integer N that we would like to factor is not divisible by 2 or 
by 3. It is not difficult to generalize the notions considered over a field. We 
do it in an ad hoc manner. 


2.2. Definition. The projective plane over $\mathbf{Z}/N\mathbf{Z}$ is defined as the quotient of the set $\{(x_0, x_1, x_2) \in (\mathbf{Z}/N\mathbf{Z})^3 \mid \gcd(x_0, x_1, x_2, N) = 1\}$ by the relation $(x_0, x_1, x_2) \sim (ux_0, ux_1, ux_2)$ for $u \in (\mathbf{Z}/N\mathbf{Z})^*$.

If we assume that $\gcd(N, 6) = 1$, an elliptic curve over $\mathbf{Z}/N\mathbf{Z}$ is given by the Weierstrass equation,

$$zy^2 = x^3 + axz^2 + bz^3 \quad\text{with } a, b \in \mathbf{Z}/N\mathbf{Z} \text{ and } 4a^3 + 27b^2 \in (\mathbf{Z}/N\mathbf{Z})^*.$$


We will now carry out the calculations on the points of this elliptic curve, using the addition formulas 5-1.7. It seems like this might be problematic, because (for example) in order to add $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ one should invert $(x_1 - x_2)$, but if we keep in mind that we are trying to factor $N$, it is enough to calculate $\gcd(x_1 - x_2, N)$ and to observe that if this is nontrivial, we have found a factorization. Another possibility would be to use the projective coordinate formulas, and in this case, we would verify that $E(\mathbf{Z}/N\mathbf{Z})$ is a group. It turns out that from an algorithmic point of view, it is more economical to do the calculations in affine coordinates.

It is important to notice that the computation of $[m](P)$ does not require $m$ addition steps (which would be prohibitive), but by fast exponentiation only needs $O(\log m)$ addition (or duplication) steps. Here, $\log m(Y) = \psi(Y) \sim Y$.

In order to effectively construct an elliptic curve and a point modulo $N$ “at random”, we could use many methods. One of the simplest ones consists of randomly choosing three integers modulo $N$, say $x_0$, $y_0$ and $a$, and setting $b := y_0^2 - x_0^3 - ax_0$. One needs to check that $\Delta := 4a^3 + 27b^2$ is invertible modulo $N$ (if it is not, we have almost surely found a factorization of $N$), and we have a point $P = (x_0, y_0)$ on the elliptic curve $y^2 = x^3 + ax + b$. We therefore try to calculate $[m(Y)]P$; if the calculation does not produce a factor of $N$, we choose a new elliptic curve and start over. Notice that we can carry out the calculations on many elliptic curves simultaneously.
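The procedure of the last few paragraphs can be sketched as follows. This is only an illustration under stated assumptions: affine coordinates as suggested above, a fixed random seed and a cap on the number of curves (both choices made here), and hypothetical function names. A failed inversion modulo $N$ is signalled by an exception carrying the gcd.

```python
import math
import random


class FactorFound(Exception):
    """Raised when a denominator is not invertible modulo N."""
    def __init__(self, g):
        self.g = g


def lcm_up_to(Y):
    m = 1
    for k in range(2, Y + 1):
        m = m * k // math.gcd(m, k)
    return m


def inv_mod(x, N):
    g = math.gcd(x % N, N)
    if g != 1:
        raise FactorFound(g)  # g may equal N: the caller must check
    return pow(x, -1, N)


def ec_add(P, Q, a, N):
    """Add two points of y^2 = x^3 + ax + b modulo N (None is the origin)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % N == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * inv_mod(2 * y1, N) % N
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1, N) % N
    x3 = (lam * lam - x1 - x2) % N
    return x3, (lam * (x1 - x3) - y1) % N


def ec_mul(m, P, a, N):
    """Compute [m]P with O(log m) additions and duplications."""
    R = None
    while m:
        if m & 1:
            R = ec_add(R, P, a, N)
        P = ec_add(P, P, a, N)
        m >>= 1
    return R


def ecm(N, Y, curves=200, seed=0):
    """Try up to `curves` random curves; return a non-trivial factor or None."""
    rng = random.Random(seed)
    m = lcm_up_to(Y)
    for _ in range(curves):
        x0, y0, a = (rng.randrange(N) for _ in range(3))
        b = (y0 * y0 - x0 ** 3 - a * x0) % N
        g = math.gcd(4 * a ** 3 + 27 * b * b, N)
        if g == N:
            continue            # singular modulo every prime factor: skip
        if g > 1:
            return g            # the discriminant already reveals a factor
        try:
            ec_mul(m, (x0, y0), a, N)
        except FactorFound as found:
            if found.g < N:
                return found.g
    return None


if __name__ == "__main__":
    print(ecm(455839, 30))  # 455839 = 599 * 761
```

Which factor is returned, and after how many curves, depends on the random choices; the sketch makes no attempt at the optimizations used by real implementations.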


To analyze the performance of these algorithms, we need to estimate the number of $Y$-smooth integers. The following useful proposition is due to Canfield, Erdős and Pomerance [21].


2.3. Proposition. If $2 \le y \le x$, we define:

$$\psi(x, y) := \mathrm{card}\{n \le x \mid n \text{ is } y\text{-smooth}\},$$

and let $u := \log x/\log y$. Then we have the formula

$$\psi(x, y) = x\,u^{-u(1 + o(1))},$$

where the term $o(1)$ tends uniformly to 0 as $x$ tends to infinity, and for a given $\varepsilon > 0$, $y$ satisfies: $(\log x)^\varepsilon \le \log y \le (\log x)^{1-\varepsilon}$.

In particular, if $y := \exp\left[C(\log x)^\alpha(\log\log x)^\beta\right]$, we have:

$$\psi(x, y) = x\exp\left[-\frac{1-\alpha}{C}(\log x)^{1-\alpha}(\log\log x)^{1-\beta}(1 + o(1))\right].$$

The statement given here corresponds to the domain $u \in [(\log x)^\varepsilon, (\log x)^{1-\varepsilon}]$. We find a precise description of the asymptotic behavior of the function $\psi(x, y)$ in Tenenbaum's book [72].
Actually, we would need to know how many Y-smooth numbers an interval 
of type [p—2,/p, p+2,/p] contains; in other words, we need a lower estimate 
of the value ~(p+ 2,/p, Y) — Y(p— 2\/p, Y). We do not know how to prove 
the estimate given below, in the same way that we do not know how to 
prove the existence of prime numbers in very small intervals, but it has 
been confirmed experimentally. 
Set L(x) := exp Vlogxloglogx. Proposition A-2.3 says that the proba- 
1 
bility that a random number < z is L(x)*-smooth is L(x)~ 2a 7°), It is 
therefore natural to conjecture that this statement is still true on a suffi- 
ciently large interval. 


2.4. Conjecture. The ratio of $L(x)^a$-smooth numbers in the interval
$[x - \sqrt{x},\, x + \sqrt{x}]$ is $\ge L(x)^{-\frac{1}{2a}+o(1)}$.


If the conjecture is true, in order to find an elliptic curve where the number
of points over $\mathbb{F}_p$ is $L(p)^a$-smooth, we have to try
$L(p)^{\frac{1}{2a}+o(1)}$ of them. We should perform $L(p)^a$ operations on each
curve, hence a complexity on the order of $L(p)^{a+\frac{1}{2a}}$. By choosing
$a = 1/\sqrt{2}$ (in other words Y on the order of $L(p)^{1/\sqrt{2}}$), we
therefore obtain a complexity on the order of

$$L(p)^{\sqrt{2}} = \exp\sqrt{2\log p\log\log p}.$$
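For the reader who wants to see where the value of a comes from, here is the one-line optimization of the exponent, written out as a worked computation (the name g for the exponent is introduced here only for convenience):

```latex
g(a) = a + \frac{1}{2a}, \qquad
g'(a) = 1 - \frac{1}{2a^{2}} = 0 \iff a = \frac{1}{\sqrt{2}}, \qquad
g\!\left(\tfrac{1}{\sqrt{2}}\right) = \frac{1}{\sqrt{2}} + \frac{\sqrt{2}}{2} = \sqrt{2}.
```

Since g tends to infinity as a tends to 0 or to infinity, this critical point is the minimum.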


The complexity of the algorithm depends on the size of the smallest factor 
of N. This property is not very useful for factoring RSA type numbers but 
is a major advantage for most other integers. 


Another property of the algorithm is that it does not require too much
memory: in fact, we only have to save data which are polynomial in log N.


3. Factorization and Number Fields 


We will sketch the number field sieve algorithm originally suggested by 
Pollard and developed by Buhler, Lenstra and Pomerance (cf. [19]). 


We are looking for an irreducible, monic polynomial $f(X) \in \mathbb{Z}[X]$ and an
integer m such that $f(m) \equiv 0 \bmod N$. One handy and efficient method is
to choose an integer d (usually 2 < d < 5), then to look at
$m := \lfloor N^{1/d} \rfloor$, to write N in base m (in other words calculate
$a_i \in [0, m-1]$ such that $N = a_0 + a_1 m + \cdots + a_{d-1}m^{d-1} + m^d$)
and to choose

$$f(X) := a_0 + a_1 X + \cdots + a_{d-1}X^{d-1} + X^d.$$

Let us point out that the fact that the base m expansion starts with $m^d$ is
equivalent to $m^d \le N \le 2m^d - 1$. The first inequality is true by
construction and the second will be true if, for example, $N > 2^{d(d+1)}$.
Finally, there is no saying a priori that f(X) will be irreducible, but if
that is not the case a factorization $f(X) = f_1(X)f_2(X)$ will give
$N = f_1(m)f_2(m)$, which would be of course exactly what we want.
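The base m construction is easy to try out. The following minimal sketch (hypothetical helper names `integer_root`, `base_m_polynomial`, `evaluate`; coefficients are stored from degree 0 upwards) computes m and the coefficients of f, and raises an error when the expansion does not start with $m^d$:

```python
def integer_root(N, d):
    # Largest integer m with m**d <= N, found by binary search.
    lo, hi = 1, 1 << (N.bit_length() // d + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= N:
            lo = mid
        else:
            hi = mid - 1
    return lo

def base_m_polynomial(N, d):
    # Returns (m, [a_0, ..., a_{d-1}, 1]) with N = sum a_i * m**i.
    m = integer_root(N, d)
    coeffs, n = [], N
    for _ in range(d + 1):
        n, r = divmod(n, m)
        coeffs.append(r)
    if n != 0 or coeffs[d] != 1:
        raise ValueError("the base m expansion of N is not monic of degree d")
    return m, coeffs

def evaluate(coeffs, x):
    # Horner evaluation of the polynomial with the given coefficients.
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value
```

For instance, with N = 45113 and d = 3 one finds m = 35 and f(X) = X^3 + X^2 + 28X + 33, and indeed f(35) = 45113.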


We will now construct a ring $A := \mathbb{Z}[X]/(f(X)) = \mathbb{Z}[\alpha]$ (where
$\alpha$ is a root of f) and its field of fractions $K = \mathbb{Q}(\alpha)$. The idea
of the algorithm is to look for a set S of pairs of integers (a, b) such that

i) $\prod_{(a,b)\in S}(a + bm)$ is a square (in $\mathbb{Z}$),
ii) $\prod_{(a,b)\in S}(a + b\alpha)$ is a square (in $\mathbb{Z}[\alpha]$).

If we have succeeded in doing that, we can consider the ring homomorphism
$\phi : A \to \mathbb{Z}/N\mathbb{Z}$ given by $\phi(\alpha) = m$. We then find
$\beta \in \mathbb{Z}[\alpha]$ such that $\beta^2 = \prod_{(a,b)\in S}(a + b\alpha)$, then
$\phi(\beta)$ and $u \in \mathbb{Z}$ such that $u^2 = \prod_{(a,b)\in S}(a + bm)$.
By construction, $\phi(\beta)^2 = u^2$ in $\mathbb{Z}/N\mathbb{Z}$. We then compute
$\gcd(\phi(\beta) + u, N)$ and $\gcd(\phi(\beta) - u, N)$, which will very likely
give us a factorization.
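The last step, extracting a factor from a congruence of squares, can be illustrated independently of the number field machinery. In the sketch below, `split_from_squares` is a hypothetical helper, and the integers x and u simply play the roles of $\phi(\beta)$ and u:

```python
from math import gcd

def split_from_squares(x, u, N):
    # Given x^2 = u^2 (mod N), try to split N with the two gcds.
    if (x * x - u * u) % N != 0:
        raise ValueError("x^2 and u^2 are not congruent modulo N")
    for d in (gcd(x + u, N), gcd(x - u, N)):
        if 1 < d < N:
            return d
    return None  # x = +/- u (mod N): this congruence gives nothing
```

With N = 91, x = 10 and u = 3 we have 100 - 9 = 91, and the function returns 13. When x is congruent to u or to -u the attempt fails, which is why the text says "very likely" rather than "always".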


The main difficulties are, on the one hand, to construct a set of pairs (a, b)
which satisfy the conditions i) and ii) and, on the other hand, to compute
the square root of $\gamma := \prod_{(a,b)\in S}(a + b\alpha)$ in $\mathbb{Z}[\alpha]$.


To find a “simultaneous root”, the idea is to choose a parameter Y, then to
choose (by way of a number field sieve) integer pairs (a, b) such that
$a + bm$ and $a + b\alpha$ are Y-smooth. We define an algebraic integer
$\gamma \in \mathbb{Z}[\alpha]$ to be Y-smooth if $N^K_{\mathbb{Q}}(\gamma)$ is itself
Y-smooth. Having constructed a large enough set, say T, of pairs (a, b) (we
need card(T) to be greater than $\pi(Y)$), we perform Gaussian elimination
over $\mathbb{F}_2$ in order to find an adequate subset S. Initially, we will get
an S such that

$$\prod_{(a,b)\in S}(a + bm) \quad\text{and}\quad N^K_{\mathbb{Q}}\prod_{(a,b)\in S}(a + b\alpha) \quad\text{are squares (in } \mathbb{Z}\text{).}$$
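The linear algebra step over $\mathbb{F}_2$ can be sketched as follows. This is a toy illustration for positive rational integers only (signs and the algebraic side are ignored), with hypothetical names `exponent_parity` and `find_square_subset`; each number is encoded by the bitmask of the parities of its exponents over the factor base.

```python
def exponent_parity(n, primes):
    # Bitmask of the odd exponents of n over the factor base,
    # or None if n is not smooth over that base.
    mask = 0
    for i, p in enumerate(primes):
        while n % p == 0:
            n //= p
            mask ^= 1 << i
    return mask if n == 1 else None

def find_square_subset(vectors):
    # Gaussian elimination over F_2 on bitmask vectors.
    # Returns indices of a non-empty subset whose XOR is zero, or None.
    basis = {}  # pivot bit -> (reduced vector, bitmask of indices used)
    for idx, v in enumerate(vectors):
        combo = 1 << idx
        while v:
            pivot = v.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (v, combo)
                break
            bv, bc = basis[pivot]
            v ^= bv
            combo ^= bc
        if v == 0:
            return [i for i in range(len(vectors)) if combo >> i & 1]
    return None
```

For the numbers 6, 10, 15, 7 and the factor base 2, 3, 5, 7, the subset found is {6, 10, 15}, whose product 900 is the square of 30. A dependency is guaranteed as soon as there are more vectors than primes in the factor base.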


This of course is not enough to guarantee that
$\gamma := \prod_{(a,b)\in S}(a + b\alpha)$ is a square. Now we see how to refine
the number field sieve so that at least the ideal generated by $\gamma$ is a
square. In general, the fact that the norm of an ideal is a square does not
at all imply that the ideal is a square. We can however take advantage of
the particular form of the algebraic numbers that we have produced.
3.1. Lemma. Let p be a prime number which does not divide
$(\mathcal{O}_K : \mathbb{Z}[\alpha])$. An ideal $\mathfrak{p}$ above p and which divides
$a + b\alpha$ is an ideal which has norm p (i.e., of degree 1) and corresponds
to the root $r \bmod p$ of $f(X) \bmod p$ such that $a + br \equiv 0 \bmod p$.


Proof. In fact, according to the hypothesis, if the factorization in
$\mathbb{F}_p[X]$ is written as $f(X) = f_1(X)^{e_1}\cdots f_g(X)^{e_g} \in \mathbb{F}_p[X]$,
this corresponds to a decomposition
$p\mathcal{O}_K = \mathfrak{p}_1^{e_1}\cdots\mathfrak{p}_g^{e_g}$ with
$N(\mathfrak{p}_i) = p^{\deg(f_i)}$ (see Exercise 3-6.20).


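We therefore refine the decomposition of $a + b\alpha$ by introducing

$$R(p) := \{r \in \mathbb{Z}/p\mathbb{Z} \mid f(r) \equiv 0 \bmod p\}$$

and the exponent corresponding to each $r \in R(p)$:

$$e_{p,r}(a + b\alpha) = \begin{cases} \mathrm{ord}_p\, N(a + b\alpha) & \text{if } a + br \equiv 0 \bmod p, \\ 0 & \text{if not.} \end{cases}$$

Therefore, we have
$(a + b\alpha)\mathcal{O}_K = \prod_p\prod_{r\in R(p)}\mathfrak{p}_{p,r}^{e_{p,r}(a+b\alpha)}$
(ignoring the factors where p divides $(\mathcal{O}_K : \mathbb{Z}[\alpha])$). Hence
$N(a + b\alpha) = \pm\prod_p\prod_{r\in R(p)} p^{e_{p,r}(a+b\alpha)}$.
Most importantly, the ideal generated by
$\gamma := \prod_{(a,b)\in S}(a + b\alpha)$ will be a square if and only if

$$\sum_{(a,b)\in S} e_{p,r}(a + b\alpha) \equiv 0 \bmod 2 \quad (\text{for every } p \text{ and } r \in R(p)).$$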
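To make the exponents $e_{p,r}$ concrete, here is a small sketch with hypothetical helper names. It uses the standard identity $N(a + b\alpha) = \sum_j c_j a^j (-b)^{d-j}$ for a monic $f = \sum_j c_j X^j$ of degree d, which is an assumption brought in here and not stated in the text; roots modulo p are found by brute force, which is only reasonable for small p.

```python
def norm(a, b, f):
    # Norm of a + b*alpha, where f = [c_0, ..., c_d] is monic with root alpha:
    # N(a + b*alpha) = sum_j c_j * a**j * (-b)**(d - j).
    d = len(f) - 1
    return sum(c * a ** j * (-b) ** (d - j) for j, c in enumerate(f))

def roots_mod(f, p):
    # R(p): the roots of f modulo p, by exhaustive search.
    return [r for r in range(p)
            if sum(c * pow(r, j, p) for j, c in enumerate(f)) % p == 0]

def ord_p(n, p):
    # Exponent of p in the non-zero integer n.
    k = 0
    while n != 0 and n % p == 0:
        n //= p
        k += 1
    return k

def exponents(a, b, f, p):
    # The map r -> e_{p,r}(a + b*alpha) for r in R(p).
    n = norm(a, b, f)
    return {r: (ord_p(n, p) if (a + b * r) % p == 0 else 0)
            for r in roots_mod(f, p)}
```

With f = X^2 + 1 (so that alpha = i), the element 7 + i has norm 50 and `exponents(7, 1, [1, 0, 1], 5)` gives exponent 2 at the root r = 3 and 0 at r = 2, in agreement with 7 + i = (2 + i)^2 (1 - i).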


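The fact that $\gamma\mathcal{O}_K$ is a square (of ideals) does not always imply
that $\gamma$ is a square, but we have gotten closer. In order to measure how
close, we introduce the group

$$\mathcal{G} := \{\gamma \in K^* \mid \text{there exists a fractional ideal } \mathfrak{A} \text{ such that } (\gamma) = \mathfrak{A}^2\}.$$

If we denote by $\mathrm{Cl}_K$ the ideal class group and $\mathrm{Cl}_K[2]$ the
subgroup of elements killed by 2, we have the following exact sequence:

$$0 \to \mathcal{O}_K^*/\mathcal{O}_K^{*2} \to \mathcal{G}/K^{*2} \to \mathrm{Cl}_K[2] \to 0$$

($\gamma \in \mathcal{G}$ maps to the class of $\mathfrak{A}$ such that
$\mathfrak{A}^2 = \gamma\mathcal{O}_K$). In particular according to the unit theorem
(Theorem 3-5.6), we have

$$\mathrm{rank}\;\mathcal{G}/K^{*2} = r_1 + r_2 + \mathrm{rank}\;\mathrm{Cl}_K[2].$$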
The computation of $\mathrm{Cl}_K[2]$ is simply too costly to carry out. However,
in order to increase the chances that $\gamma$ is a square, we could calculate
a small number of generalized Legendre symbols: we choose some prime ideals
$\mathfrak{p}_1, \dots, \mathfrak{p}_s$ and compute

$$\left(\frac{\gamma}{\mathfrak{p}}\right) = \begin{cases} +1 & \text{if } \gamma \text{ is a non-zero square modulo } \mathfrak{p}, \\ -1 & \text{if } \gamma \text{ is not a square modulo } \mathfrak{p}. \end{cases}$$

In this way, we can refine the sieve that produces candidates for squares by
insisting that $\prod_{(a,b)\in S}(a + bm)$ is a square in $\mathbb{Z}$ and that
$\prod_{(a,b)\in S}(a + b\alpha)$ generates a square ideal in $\mathcal{O}_K$ and
$\left(\frac{\gamma}{\mathfrak{p}_i}\right) = +1$ for $i = 1, \dots, s$.
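As an illustration of such a test, here is a minimal sketch under an extra assumption not made in the text: the prime ideals are taken of degree one, of the form $(q, \alpha - r)$ with q an odd prime and $f(r) \equiv 0 \bmod q$, so that reducing modulo the ideal amounts to replacing $\alpha$ by r modulo q. The names `legendre` and `character` are hypothetical.

```python
def legendre(x, q):
    # Legendre symbol of x modulo the odd prime q, via Euler's criterion.
    x %= q
    if x == 0:
        return 0
    return 1 if pow(x, (q - 1) // 2, q) == 1 else -1

def character(pairs, r, q):
    # Symbol of gamma = prod (a + b*alpha) at the degree-one prime (q, alpha - r).
    # Assumption: f(r) = 0 (mod q); this is not checked here.
    s = 1
    for a, b in pairs:
        s *= legendre(a + b * r, q)
    return s
```

A set S for which `character` returns -1 (or 0) at one of the chosen primes cannot give a square and can be discarded before the expensive square root computation.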

We need to observe that the “sieve” part will use a lot of memory: we
should in fact calculate and save the prime numbers smaller than Y, then
arrange the pairs of numbers $a + bm$ for a and b in a chosen interval, test
their divisibility by primes smaller than Y keeping only those which reduce
to $\pm 1$, and then start over again to sieve the $N^K_{\mathbb{Q}}(a + b\alpha)$.


In order to compute the square root of an algebraic integer
$\gamma \in \mathbb{Z}[\alpha]$, let us suppose (to make things simpler) that we know
its minimal polynomial $F(X) \in \mathbb{Z}[X]$. Observe that if G(X) is the
minimal polynomial of $\sqrt{\gamma}$, then G(X) and G(−X) divide $F(X^2)$. This
suggests that we should use the following procedure.

— We factor the polynomial $F(X^2) = G(X)G(-X)$ (in $\mathbb{Z}[X]$);

— We perform the division algorithm $G(X) = (X^2 - \gamma)Q(X) + R(X)$ in
$\mathbb{Z}[\alpha][X]$;

— If $R(X) = aX + b$, set $\beta = -b/a$.

Then $\beta^2 = \gamma$.

Let us point out that the polynomial $F(X^2)$ necessarily factors in the
stated form, and that if the remainder R(X) is constant, then the number
$\gamma$ is not a square! That is to say that we have made a false assumption
(for example, we might have neglected the index
$(\mathcal{O}_K : \mathbb{Z}[\alpha])$ or might have not compensated enough for
$\mathcal{G}/K^{*2}$). We therefore need to start all over again with another
set S.


Let us also point out that in general $\mathbb{Z}[\alpha]$ is not integrally closed
and $\sqrt{\gamma}$ might not be an element of $\mathbb{Z}[\alpha]$. Nevertheless, it
is easy to overcome this obstacle: in fact, if f is the minimal polynomial
of $\alpha$, then $f'(\alpha)\mathcal{O}_K \subset \mathbb{Z}[\alpha]$, and we can safely
replace $\gamma$ by $f'(\alpha)^2\gamma$.

We refer you to the original article [19] or to [22] for an analysis of the
complexity, which, modulo a “reasonable” conjecture, is on the order of

$$O\left(\exp\left(C(\log N)^{1/3}(\log\log N)^{2/3}\right)\right).$$


For very large numbers which do not have any medium-sized factors, this 
algorithm is therefore more powerful than the elliptic curve algorithm. 
3.2. Remark. (Factorization and quantum computers) We have seen in
Exercise 2-7.15 that a quick calculation of the period (or order) of
$a \bmod N$ will provide a fast factorization algorithm. In 1997 Shor [69]
proved that if one had access to a “quantum computer”, one could calculate
this period in polynomial time. It is not known whether such a computer
(with the required properties) could be built, but this discovery stimulated
a field of research which is still very active.
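The classical half of this idea is short enough to write down. The sketch below is an illustration of the usual reduction from order finding to factoring, not a transcription of the exercise; the names are hypothetical, and the order is computed by brute force, which is precisely the step a quantum computer would replace.

```python
from math import gcd

def multiplicative_order(a, N):
    # Smallest r >= 1 with a**r = 1 (mod N); exponential time in general.
    if gcd(a, N) != 1:
        raise ValueError("a must be invertible modulo N")
    r, x = 1, a % N
    while x != 1:
        x = x * a % N
        r += 1
    return r

def factor_from_order(a, N):
    # If the order r of a is even and a**(r/2) != -1 (mod N),
    # then gcd(a**(r/2) - 1, N) is a nontrivial factor of N.
    r = multiplicative_order(a, N)
    if r % 2 == 1:
        return None
    h = pow(a, r // 2, N)
    if h == N - 1:
        return None
    return gcd(h - 1, N)
```

For N = 15 and a = 7 the order is 4, $7^2 \equiv 4$, and gcd(3, 15) = 3. For a = 14 the order is 2 but $14 \equiv -1$, so this choice of a fails and another one must be tried.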


Appendix B 


Elementary Projective 
Geometry 


“La línea consta de un número infinito de puntos; el plano, de un número infinito de
líneas; el volumen, de un número infinito de planos; el hipervolumen, de un número
infinito de volúmenes...

No, decididamente no es éste, more geométrico, el mejor modo de iniciar mi relato.”


JORGE LUIS BORGES (EL LIBRO DE ARENA)


We will give an introduction to projective algebraic geometry: lines, conics, 
quadrics, cubics and Bézout’s theorem on the number of points of intersec- 
tion of two plane curves. We will clarify the notion of smoothness from a 
purely algebraic point of view. The projective context allows us to intro- 
duce the notion of “reduction modulo p” of a rational point on an algebraic 
variety. We will finish with some allusions to intersection theory. 


1. Projective Space 


1.1. Definition. Let K be a field. The affine space over K of dimension
n, denoted $\mathbb{A}^n$ or $\mathbb{A}^n(K)$, is the set $K^n$. The projective space
over K of dimension n, denoted $\mathbb{P}^n$ or $\mathbb{P}^n(K)$, is the set of lines
through the origin in the vector space $E = K^{n+1}$ or the quotient of
$K^{n+1} \setminus \{0\}$ by the equivalence relation
$(x_0, \dots, x_n) \sim (y_0, \dots, y_n)$ if there exists $u \in K^*$ such that
$(x_0, \dots, x_n) = (uy_0, \dots, uy_n)$. If P is the equivalence class of
$(x_0, \dots, x_n)$, we say that $(x_0, \dots, x_n)$ are projective coordinates
of P, and we simply write $P = (x_0, \dots, x_n)$.

M. Hindry, Arithmetics, Universitext, 271 
1.2. Remark. The value of a polynomial at $P = (x_0, \dots, x_n) \in \mathbb{P}^n$ does
not have any meaning, but if the polynomial $F(x_0, \dots, x_n)$ is homogeneous,
the fact that $F(x_0, \dots, x_n) = 0$ or $F(x_0, \dots, x_n) \ne 0$ is independent
of the projective coordinates. This allows us to make the following definition.


1.3. Definition. An algebraic subset of $\mathbb{A}^n$ (resp. of $\mathbb{P}^n$) is the
set of common zeros of a family of polynomials (resp. homogeneous
polynomials). A linear subvariety or linear subspace is the set of zeros of a
family of homogeneous linear polynomials. The dimension of a linear
subvariety of $\mathbb{P}^n$ is the dimension of the corresponding vector space
minus one. A conic (resp. cubic) in $\mathbb{P}^2$ is the set of zeros of a
homogeneous polynomial in $(x_0, x_1, x_2)$ of degree 2 (resp. of degree 3).


1.4. Remark. More generally, we can define the dimension of an algebraic
subset $V \subset \mathbb{P}^n$ as follows: let s be the maximal dimension of a linear
subvariety L such that $V \cap L = \emptyset$, then $\dim V := n - s - 1$. We will
freely use the natural vocabulary of calling a curve an algebraic subset of
dimension 1 and a surface an algebraic subset of dimension 2.


1.5. Proposition. Let $L_1$ and $L_2$ be two linear subspaces of dimension
$n_1$ and $n_2$ such that $n_1 + n_2 \ge n$. Then $L_1 \cap L_2$ is non-empty.
Moreover, this intersection is a linear subspace of dimension
$\ge n_1 + n_2 - n$. If the dimension is equal to $n_1 + n_2 - n$, we say that
$L_1$ and $L_2$ intersect transversally.


Proof. Consider the map $\pi : K^{n+1} \setminus \{0\} \to \mathbb{P}^n(K)$. The linear
subspaces $L_i$ are images of vector subspaces (minus the origin) $E_i$ of
dimension $n_i + 1$. From linear algebra, we know that
$\dim(E_1 \cap E_2) \ge (n_1 + 1) + (n_2 + 1) - (n + 1)$; thus $F := E_1 \cap E_2$ is
a vector subspace of dimension $\ge n_1 + n_2 - n + 1$, and consequently the
image $L_1 \cap L_2 = \pi(F \setminus \{0\})$ is non-empty and of dimension
$\ge n_1 + n_2 - n$.


Remark. This statement contains the classical fact that two lines in the 
projective plane always either meet at one point or are coincident. 


The following procedure, called a Segre embedding, allows us to consider the 
product of projective spaces or projective varieties as a projective variety. 


1.6. Proposition. (Segre embedding) The map
$S : \mathbb{P}^n \times \mathbb{P}^m \to \mathbb{P}^{nm+n+m}$ given by
$((x_0, \dots, x_n), (y_0, \dots, y_m)) \mapsto (x_i y_j)_{0\le i\le n,\ 0\le j\le m}$
is a bijection between $\mathbb{P}^n \times \mathbb{P}^m$ and an algebraic subset of
$\mathbb{P}^{nm+n+m}$.


Proof. Let $z_{i,j}$ be coordinates of $\mathbb{P}^{nm+n+m}$. We can immediately see
that the image of S is contained in the variety defined by
$z_{i,j}z_{k,\ell} - z_{i,\ell}z_{k,j} = 0$. Conversely, let a point R with
coordinates $z_{i,j}$ satisfy these equations. If, for example,
$z_{0,0} \ne 0$, then we can assume that $z_{0,0} = 1$, hence
$z_{k,\ell} = z_{k,0}z_{0,\ell}$, and therefore
$R = S((1, z_{1,0}, \dots, z_{n,0}), (1, z_{0,1}, \dots, z_{0,m}))$.
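Both directions of this proof can be checked numerically on small examples. The following sketch (hypothetical names; points are lists of integers, and the image point is stored as a matrix $z_{i,j}$) builds the Segre image, tests the quadratic equations, and recovers a preimage from any non-zero entry, generalizing the case $z_{0,0} \ne 0$ treated above:

```python
from itertools import product

def segre(x, y):
    # z[i][j] = x_i * y_j
    return [[xi * yj for yj in y] for xi in x]

def satisfies_segre_equations(z):
    # Checks z[i][j]*z[k][l] == z[i][l]*z[k][j] for all indices.
    rows, cols = range(len(z)), range(len(z[0]))
    return all(z[i][j] * z[k][l] == z[i][l] * z[k][j]
               for i, k in product(rows, repeat=2)
               for j, l in product(cols, repeat=2))

def preimage(z):
    # Recover (x, y), up to scaling, from a point satisfying the equations,
    # using the row and the column through any non-zero entry.
    for i0, row in enumerate(z):
        for j0, v in enumerate(row):
            if v != 0:
                x = [z[i][j0] for i in range(len(z))]
                y = [z[i0][j] for j in range(len(row))]
                return x, y
    raise ValueError("the zero vector is not a projective point")
```

Note that `segre(*preimage(z))` returns z multiplied by the chosen non-zero entry, which is the same projective point.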


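The affine space $\mathbb{A}^n$ can be seen as a subspace of $\mathbb{P}^n$ by considering
$(x_1, \dots, x_n) \mapsto (1, x_1, \dots, x_n)$. The image is the subset

$$U_0 = \{(x_0, x_1, \dots, x_n) \in \mathbb{P}^n \mid x_0 \ne 0\}.$$

The complement is the hyperplane $x_0 = 0$, which can be viewed as
$\mathbb{P}^{n-1}$ and which is often called “the hyperplane at infinity”. We can
therefore write:

$$\mathbb{P}^n = \mathbb{A}^n \sqcup \mathbb{P}^{n-1} = \mathbb{A}^n \sqcup \mathbb{A}^{n-1} \sqcup \cdots \sqcup \mathbb{A}^1 \sqcup \mathbb{A}^0.$$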


We can actually cover the projective space by open affine sets by setting
$U_i := \{P \in \mathbb{P}^n \mid x_i(P) \ne 0\}$. We see that
$\mathbb{P}^n = U_0 \cup \cdots \cup U_n$ and that, on the one hand, the map from
$\mathbb{A}^n$ to $U_i$ given by the formula
$(x_1, \dots, x_n) \mapsto (x_1, \dots, x_i, 1, x_{i+1}, \dots, x_n)$ and, on the
other hand, the map from $U_i$ to $\mathbb{A}^n$ given by
$(x_0, \dots, x_n) \mapsto (x_0/x_i, \dots, x_{i-1}/x_i, x_{i+1}/x_i, \dots, x_n/x_i)$
are reciprocal bijections.


1.7. Definition. The Zariski topology on $\mathbb{A}^n$ (resp. on $\mathbb{P}^n$) is the
topology whose closed sets are algebraic subsets, i.e., common zeros of a
family of polynomials (resp. homogeneous polynomials).


We can immediately verify that it is indeed a topology: if $V_i$ is the set of
zeros of a homogeneous ideal $I_i$ (generated by homogeneous polynomials),
then $V_1 \cup V_2$ is the set of zeros of the ideal $I_1 I_2$, and
$\bigcap_{i\in S} V_i$ is the set of zeros of the ideal $\sum_i I_i$. Notice that
the sets $U_i \subset \mathbb{P}^n$ are open and dense.


1.8. Definition. An affine (or projective) algebraic subset V is irreducible
if it is not possible to write it as the union of two closed proper subsets,
i.e., if $V = V_1 \cup V_2$, with $V_i$ closed, then either $V = V_1$ or
$V = V_2$. An irreducible algebraic subset is called an algebraic variety.


We can easily show that every algebraic set can be written as a finite
union of irreducible algebraic subsets. If we eliminate redundancies, this
decomposition is unique, and the maximal irreducible subsets are called
irreducible components. The following lemma allows us to clarify a
phrase which is often seen and used in geometry, “It suffices to verify this
in the general case.”


1.9. Lemma. Let V be an affine (resp. projective) variety and Z a proper
algebraic subset (i.e., $Z \ne V$). Let F be a polynomial (resp. homogeneous
polynomial), and suppose that F vanishes on $V \setminus Z$. Then it vanishes
on V.


Proof. The ring of polynomial functions on V is integral since V is
irreducible; this is true because if fg is zero on V and if we had $f \ne 0$
and $g \ne 0$ on V, then we could deduce that we had a nontrivial decomposition
$V = (V \cap \{f = 0\}) \cup (V \cap \{g = 0\})$. So let G = 0 be a nontrivial
equation of Z, hence FG = 0 on V. Since $G \ne 0$ on V, we can indeed deduce
that F = 0 on V.


We can define in an intuitive manner the morphisms or algebraic maps 
between two varieties as “maps defined by polynomials”. A more precise 
definition is as follows. 


1.10. Definition. Let $V \subset \mathbb{A}^m$ and $W \subset \mathbb{A}^n$ be two affine
varieties defined over a field K. A morphism or algebraic map $f : V \to W$
defined over K is given by n polynomials
$f_1, \dots, f_n \in K[X_1, \dots, X_m]$ such that for every $x \in V$, the point
$f(x) = (f_1(x), \dots, f_n(x))$ is in W. We say that f is an isomorphism if
there exists another morphism $g : W \to V$ such that
$f \circ g = \mathrm{id}_W$ and $g \circ f = \mathrm{id}_V$.

A rational map from V to W defined over K is given by n rational functions
$f_1, \dots, f_n \in K(X_1, \dots, X_m)$ (where V is not contained in the subset of
poles of the $f_i$) such that for every x in V, outside of the poles, the point
$f(x) = (f_1(x), \dots, f_n(x))$ is in W. Such a map is denoted

$$f : V \dashrightarrow W,$$

to indicate that it is not necessarily defined everywhere. The map f is
called a birational map if there exists another rational map
$g : W \dashrightarrow V$ such that $f \circ g = \mathrm{id}_W$ and
$g \circ f = \mathrm{id}_V$ (wherever they are defined).


Let $V \subset \mathbb{P}^m$ and $W \subset \mathbb{P}^n$ be two projective varieties defined
over a field K. A morphism or algebraic map $f : V \to W$ defined over K is a
map $f : V \to W$ such that for every point $x \in V$, there exists an affine
open set U in V which contains x and an affine open set U' in W such that
$f|_U : U \to U'$ is a morphism of affine varieties.


Note that it is possible to globally define a morphism of affine varieties or
even from $\mathbb{P}^n$ to $\mathbb{P}^n$ by polynomials, but in general, we need many
charts to define a morphism of projective varieties. We will now give you
some examples of isomorphisms.


After linear subspaces, the most elementary algebraic varieties are the
quadrics, i.e., the hypersurfaces defined by a homogeneous polynomial of
degree 2 which has the form

$$Q(x_0, \dots, x_n) = \sum_{0\le i,j\le n} q_{i,j}\,x_i x_j.$$

If the characteristic of the base field is different from 2, we can assume
that $q_{i,j} = q_{j,i}$. The quadric is therefore nondegenerate if and only if
$\det(Q) = \det(q_{i,j}) \ne 0$. After a linear transformation, we can also
assume that $Q(x_0, \dots, x_n) = a_0x_0^2 + \cdots + a_nx_n^2$. Note also that Q is
either irreducible or a product of two linear forms. A reducible quadric is
degenerate, but the converse is only true for conics in $\mathbb{P}^2$. We will now
classify the quadrics (up to linear transformation) of $\mathbb{P}^2$ and
$\mathbb{P}^3$ over an algebraically closed field.


1.11. Theorem. Let K be an algebraically closed field (of characteristic
$\ne 2$). All of the nondegenerate projective conics over K are equivalent and,
in particular, isomorphic to the conic given by the equation
$y_0y_1 - y_2^2 = 0$. The latter is isomorphic to the projective line by the
map $(x_0, x_1) \mapsto (x_0^2, x_1^2, x_0x_1)$.

If K is not algebraically closed and C is a conic, then $C(K) = \emptyset$ or C
is isomorphic over K to $\mathbb{P}^1$.


Thus the usual classification of conics into ellipses, hyperbolas, and parabo- 
las is valid in real affine geometry, whereas in the projective plane over an 
algebraically closed field, there is only one conic. 


Proof. We will start by constructing a hyperbolic plane, i.e., a plane
endowed with a basis in which the quadratic form is written $Q(x, y) = xy$.
To do this, we choose an isotropic vector $e_0$, then another isotropic vector
$e_1$ not orthogonal to $e_0$, and by adjusting by a scalar, we obtain an
appropriate basis. We then choose a vector $e_2$ orthogonal to the hyperbolic
plane and such that $Q(e_2) = -1$. After a linear transformation, we indeed
have in the new basis $Q(x_0, x_1, x_2) = x_0x_1 - x_2^2$. For the second
statement, let $P \in C(K)$, and consider the set of lines passing through P.
This set is parametrized by $\mathbb{P}^1$. We can easily verify (see the more
general proof given below) that every line D passing through P intersects
the conic in a second point $P_D$: the map $D \mapsto P_D$ provides the needed
isomorphism.


1.12. Theorem. Let K be an algebraically closed field. All nondegenerate
projective quadrics in $\mathbb{P}^3$ over K are equivalent and isomorphic to
$\mathbb{P}^1 \times \mathbb{P}^1$ by the Segre map

$$((x_0, x_1), (y_0, y_1)) \mapsto (x_0y_0, x_1y_0, x_0y_1, x_1y_1),$$

whose image is the quadric given by the equation
$z_{0,0}z_{1,1} - z_{0,1}z_{1,0} = 0$. In particular, the surface is ruled in
two ways.
Proof. After a linear transformation (over K which is algebraically closed),
we can effectively assume that $Q(x_0, x_1, x_2, x_3) = x_0x_1 - x_2x_3$ (this
amounts to writing the space as a direct sum of two hyperbolic planes). The
isomorphism of the quadric to $\mathbb{P}^1 \times \mathbb{P}^1$ is thus a particular case
of Proposition B-1.6.


The following two lemmas are special cases of Bézout’s theorem, proven 
further down (Theorem B-2.4). 


1.13. Lemma. Let C be a curve of degree d (i.e., defined by a homogeneous
polynomial of degree d) in the projective plane and not containing
the line D of $\mathbb{P}^2$. Then $C \cap D$ is composed of d points (counted with
multiplicity).


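Proof. Let $F(x_0, x_1, x_2) = 0$ be the equation of degree d of C and
$a_0x_0 + a_1x_1 + a_2x_2 = 0$ that of D. One of the $a_i$ is non-zero, so we
can take it to be $a_0$. The equation of points of intersection of C and D is
therefore written $x_0 = -\frac{a_1}{a_0}x_1 - \frac{a_2}{a_0}x_2$ and

$$F\left(-\frac{a_1}{a_0}x_1 - \frac{a_2}{a_0}x_2,\ x_1,\ x_2\right) = 0,$$

which factors as $a\prod_i(\alpha_i x_1 - \beta_i x_2)^{m_i}$ with $\sum_i m_i = d$.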


1.14. Lemma. If C is a curve of degree d in the projective plane with no
components in common with the conic D of $\mathbb{P}^2$, then $C \cap D$ is composed
of 2d points (counted with multiplicity).


Proof. If the conic is composed of two lines, this lemma can be deduced from
the previous lemma. We can thus assume that the conic is irreducible. Up
to a linear change of coordinates, we can assume that the conic is written
as $x_0x_1 - x_2^2 = 0$ and hence that it is parametrized by the map from
$\mathbb{P}^1$ to $\mathbb{P}^2$ given by $(y_0, y_1) \mapsto (y_0^2, y_1^2, y_0y_1)$. Let
$F(x_0, x_1, x_2) = 0$ be the equation of C. The equation of the points of
intersection of C and D is thus written $P = (y_0^2, y_1^2, y_0y_1)$ and

$$F(y_0^2, y_1^2, y_0y_1) = 0,$$

which factors into $a\prod_i(\alpha_i y_1 - \beta_i y_0)^{m_i}$ with
$\sum_i m_i = 2d$.


Notation. We denote by $S_{n,d}$ the vector space of homogeneous polynomials
of degree d in $x_0, \dots, x_n$, and if $P_1, \dots, P_r$ are points of
$\mathbb{P}^n$, we denote by $S_{n,d}(P_1, \dots, P_r)$ the subspace of $S_{n,d}$
formed of polynomials which vanish at each $P_i$.


1.15. Definition. A linear system of hypersurfaces of degree d in
$\mathbb{P}^n$ is a vector subspace S of $S_{n,d}$.

The set of hypersurfaces corresponding to the polynomials of S can be
seen as a linear subvariety of dimension dim(S) − 1 in the projective space
corresponding to $S_{n,d}$.


1.16. Lemma. We have the following formulas:

$$\dim S_{n,d} = \binom{n+d}{d} \quad\text{and}\quad \dim S_{n,d}(P_1, \dots, P_r) \ge \dim S_{n,d} - r.$$

The lemma is obvious by noticing that vanishing at a point P is a linear
condition on the coefficients of a polynomial. The computation of the
exact dimension of $S_{n,d}(P_1, \dots, P_r)$ can however be tricky.
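The first formula simply counts monomials, and this count can be checked mechanically. In the sketch below (the helper name `monomials` is hypothetical), a monomial of degree d in $x_0, \dots, x_n$ is encoded as a non-decreasing tuple of d variable indices:

```python
from math import comb
from itertools import combinations_with_replacement

def monomials(n, d):
    # Each monomial of degree d in x_0..x_n, as a sorted tuple of d indices;
    # for example (0, 0, 2) stands for x_0^2 * x_2.
    return list(combinations_with_replacement(range(n + 1), d))

# Compare the direct count with the binomial coefficient for small n and d.
for n in range(1, 5):
    for d in range(0, 6):
        assert len(monomials(n, d)) == comb(n + d, d)
```

In particular there are 6 monomials of degree 2 and 10 of degree 3 in three variables.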


1.17. Examples. We have

$$\dim S_{2,d} = \frac{(d+2)(d+1)}{2}, \qquad \dim S_{3,d} = \frac{(d+3)(d+2)(d+1)}{6}$$

and, in particular, $\dim S_{2,2} = 6$ and
$\dim S_{2,2}(P_1, \dots, P_r) \ge 6 - r$. Thus there always passes at least one
conic through any five given points. We can specify under which conditions
such a conic is unique.


1.18. Lemma. Through any five points $P_1, \dots, P_5$ in the projective plane,
there always passes a conic. Furthermore, if no four of the points are
collinear, the conic is unique, i.e., $\dim S_{2,2}(P_1, \dots, P_5) = 1$.


Proof. We will first treat the case where three of the points, $P_1$, $P_2$,
$P_3$, are collinear. The conic must contain the line L = 0 defined by the
three points. Hence, we have
$S_{2,2}(P_1, \dots, P_5) = L\,S_{2,1}(P_4, P_5)$ since $P_4$ and $P_5$ are not on
the line L = 0. There is only one line which passes through $P_4$ and $P_5$,
hence $\dim S_{2,1}(P_4, P_5) = 1$ and $\dim S_{2,2}(P_1, \dots, P_5) = 1$. We
will now treat the case where no three of the $P_i$ are collinear. Suppose
$\dim S_{2,2}(P_1, \dots, P_5) > 1$, and let $P_6$ be a point distinct from $P_4$
and $P_5$ on the line L = 0 defined by these two points. We would then have
$\dim S_{2,2}(P_1, \dots, P_6) \ge 1$, and a corresponding conic containing
$P_4$, $P_5$, $P_6$ must contain the whole line hence be composed of two lines,
and then $P_1$, $P_2$, $P_3$ would be collinear.
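The lemma can be tested numerically: vanishing at a point is one linear condition on the six coefficients of a conic, so the conics through five points form the null space of a 5 × 6 matrix. The sketch below uses exact rational arithmetic; the names `conic_row` and `nullspace` are hypothetical, and the coefficient order $x^2, xy, xz, y^2, yz, z^2$ is a convention chosen here.

```python
from fractions import Fraction

def conic_row(P):
    # Values at P of the six monomials x^2, xy, xz, y^2, yz, z^2.
    x, y, z = P
    return [x * x, x * y, x * z, y * y, y * z, z * z]

def nullspace(rows, ncols):
    # Basis of the null space, via Gauss-Jordan elimination over Q.
    M = [[Fraction(v) for v in r] for r in rows]
    pivots, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        pv = M[r][c]
        M[r] = [v / pv for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [vi - f * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    basis = []
    for fc in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -M[row_idx][fc]
        basis.append(v)
    return basis

def conics_through(points):
    return nullspace([conic_row(P) for P in points], 6)
```

For the five points (1,0,1), (0,1,1), (−1,0,1), (0,−1,1), (3,4,5), no three of which are collinear, the null space has dimension 1 and is spanned by a multiple of $x^2 + y^2 - z^2$.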


The dimension of $S_{2,3}$ is 10. Therefore, there is always a cubic passing
through any nine points in the projective plane. If 4 of these points
are collinear, the cubic must contain the corresponding line, and if 7 of
these points are on the same conic, the cubic must contain the corresponding
conic.


1.19. Definition. A point $P = (x_0, \dots, x_n)$ on a hypersurface
$V = \{P \in \mathbb{P}^n \mid F(P) = 0\}$ is singular if
$\frac{\partial F}{\partial x_i}(P) = 0$ for $0 \le i \le n$. The hypersurface
V is singular if such a point exists and smooth if not.


1.20. Remark. We can define the notion of smoothness for subvarieties
of any dimension. To do this, let $V \subset \mathbb{P}^n$ be a subvariety of
dimension m and codimension r := n − m. If I is the ideal of polynomials
vanishing on V and $\mathcal{F}$ a finite generating set of I, a point P in V is
smooth if the rank of the matrix

$$\left(\frac{\partial F}{\partial x_i}(P)\right)_{0\le i\le n,\ F\in\mathcal{F}}$$

is equal to r (it is always $\le r$). The projective tangent space (at a point
P on a hypersurface with equation F = 0) is
$\sum_{i=0}^{n}\frac{\partial F}{\partial x_i}(P)\,x_i = 0$. The map which associates
to a nonsingular point its tangent hyperplane is classically known as the
Gauss map.


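1.21. Definition. The map “reduction modulo p” is defined from
$\mathbb{P}^n(\mathbb{Q})$ to $\mathbb{P}^n(\mathbb{F}_p)$ as follows. If $P \in \mathbb{P}^n(\mathbb{Q})$, we
choose coordinates $x_i \in \mathbb{Z}$ such that $\gcd(p, x_0, \dots, x_n) = 1$, and
we set $r_p(P) = (\bar{x}_0, \dots, \bar{x}_n)$ where $\bar{x}$ designates the class
of x in $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$. If V is a projective subvariety of
$\mathbb{P}^n$ defined over $\mathbb{Q}$, we define $\tilde{V}$ to be the “reduction
modulo p” of V as follows. Let $I_V$ be the ideal in
$\mathbb{Q}[x_0, \dots, x_n]$ of polynomials which vanish on V,
$I_{V,\mathbb{Z}} := I_V \cap \mathbb{Z}[x_0, \dots, x_n]$ and $\tilde{I}_V$ the image of
$I_{V,\mathbb{Z}}$ in $\mathbb{F}_p[x_0, \dots, x_n]$. Then $\tilde{V}$ is the subvariety
of $\mathbb{P}^n$ defined by $\tilde{I}_V$.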


We should point out that if $P \in V(\mathbb{Q})$, then
$\tilde{P} \in \tilde{V}(\mathbb{F}_p)$, and this property is specific to closed
(projective) varieties. For example, if V is a curve in $\mathbb{P}^2$ with
equation F(X, Y, Z) = 0 and U is the affine curve F(x, y, 1) = 0 (seen
as an open set in V), a point P which is in $U(\mathbb{Q})$ has reduction modulo
p, denoted $\tilde{P}$, and there is no reason that this should be in
$\tilde{U}$, the affine curve with equation $\tilde{F}(x, y, 1) = 0$. In fact,
$\tilde{P} \in \tilde{U}$ if and only if x and y are p-integers, i.e., if P is a
p-integral point of U.
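The reduction map on points is easy to experiment with. In the following sketch (the name `reduce_mod_p` is hypothetical, and Python 3.9 or later is assumed for the multi-argument `gcd` and `lcm`), rational projective coordinates are first scaled to coprime integers, which in particular satisfy the gcd condition of the definition, and then reduced modulo p:

```python
from fractions import Fraction
from math import gcd, lcm

def reduce_mod_p(coords, p):
    # Scale rational projective coordinates to coprime integers, then reduce.
    fr = [Fraction(c) for c in coords]
    den = lcm(*(f.denominator for f in fr))
    ints = [int(f * den) for f in fr]
    g = gcd(*ints)
    ints = [v // g for v in ints]
    return tuple(v % p for v in ints)
```

The affine point (x, y) = (1/3, 2), seen as (1/3, 2, 1) = (1, 6, 3), reduces modulo 5 to (1, 1, 3), which is still in the affine chart Z ≠ 0, but modulo 3 to (1, 0, 0), a point at infinity. This matches the statement above, since 1/3 is not a 3-integer.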


1.22. Proposition. Let V be a smooth subvariety of $\mathbb{P}^n(\mathbb{Q})$. For all
p, except for a finite number, the subvariety $\tilde{V}$ is a smooth
subvariety of $\mathbb{P}^n(\mathbb{F}_p)$.


Proof. If V is a hypersurface given by $F(x_0, \dots, x_n) = 0$, the hypothesis
that the hypersurface F(x) = 0 is smooth means that the resultant, R, of
the partial derivatives of F is non-zero; the latter R can be expressed as
a polynomial in the coefficients of F. Therefore, the hypersurface remains
smooth modulo p for all the prime numbers not dividing R.
2. Intersection


The theorem below is fundamental to classical algebraic geometry. It 
shows, in particular, that there is a bijection between algebraic subsets 
and reduced ideals (i.e., such that if a power of an element is in the ideal, 
then the element itself is in the ideal). 


2.1. Theorem. (Hilbert’s Nullstellensatz) Let $P_1, \dots, P_m$ be polynomials
in $K[X_1, \dots, X_n]$ where K is algebraically closed. If Q is a polynomial
which vanishes at the set of common zeros of the $P_i$, then a power of Q
is in the ideal generated by the $P_i$. In other words, there exist $r \ge 1$
and $A_i \in K[X_1, \dots, X_n]$ such that

$$Q^r = A_1P_1 + \cdots + A_mP_m.$$


Proof. (Sketch) The key algebraic lemma is the fact that a finitely
generated K-algebra which is a field must be algebraic over K, hence equal
to K if K is algebraically closed (see for example [43]). We thus consider
the polynomials $P_1, \dots, P_m, 1 - TQ \in K[X_1, \dots, X_n, T]$. According to
the hypotheses these polynomials do not have any common zeros in $K^{n+1}$.
We now prove that this implies that they generate (as an ideal) the ring
$K[X_1, \dots, X_n, T]$. If this were not the case, they would be contained in
a maximal ideal $\mathfrak{M}$, and the quotient $K[X_1, \dots, X_n, T]/\mathfrak{M}$
would be an algebraic extension of K and would thus, by the lemma recalled
above, be isomorphic to K. If we let $x_i := X_i \bmod \mathfrak{M} \in K$ and
$t := T \bmod \mathfrak{M} \in K$, we have constructed a common zero
$(x_1, \dots, x_n, t) \in K^{n+1}$ to all of the polynomials of $\mathfrak{M}$, which
is a contradiction. We can deduce from this the existence of polynomials
$U_i(X, T) \in K[X_1, \dots, X_n, T]$ such that

$$1 = U_1(X,T)P_1(X) + \cdots + U_m(X,T)P_m(X) + U_{m+1}(X,T)(1 - TQ(X)).$$

By interpreting this identity in K(X)[T], substituting T = 1/Q(X) and
multiplying by $Q(X)^r$ where $r = \max \deg_T U_i(X, T)$, we obtain

$$Q^r(X) = A_1(X)P_1(X) + \cdots + A_m(X)P_m(X)$$

where $A_i := Q^r\,U_i(X, 1/Q)$.


2.2. Remarks. 1) In the course of the proof, we proved that a maximal
ideal of $K[X_1, \dots, X_n]$ is of the form $(X_1 - a_1, \dots, X_n - a_n)$.

2) If $k \subset K$ and if $Q, P_1, \dots, P_m$ have coefficients in k, we can easily
see that we can choose the $A_i$ to have coefficients in k. Likewise, if
$Q, P_1, \dots, P_m$ are homogeneous, we can choose the $A_i$ to be homogeneous.


In the case where the polynomials define a finite set, we can estimate its 
cardinality by the following theorem (see [35] or [31]). 
2.3. Theorem. Let $Z_1, \dots, Z_n$ be hypersurfaces with degrees
$d_1, \dots, d_n$ in $\mathbb{P}^n$. The intersection $Z_1 \cap \cdots \cap Z_n$ is
non-empty, and if the intersection is finite, then the cardinality of this
set satisfies

$$\mathrm{card}(Z_1 \cap \cdots \cap Z_n) \le d_1\cdots d_n.$$


We can actually define multiplicities so as to obtain an equality in the
previous assertion. We will do this for the case of two plane curves
$C_1, C_2 \subset \mathbb{P}^2$ without any common components. To define the
multiplicity at a point P, we can work in the affine plane and consider
$f_i(x, y) = 0$ the affine equations of the $C_i$. We define the local ring
$\mathcal{O}_P$ at P = (a, b) as:

$$\mathcal{O}_P := \{F \in K(x,y) \mid \mathrm{ord}_P(F) \ge 0\} = S^{-1}K[x,y] \qquad \text{(B.1)}$$

where S is the multiplicative set of polynomials which do not vanish at P
(or which do not appear in the ideal (x − a, y − b)). We then set

$$\mathrm{mult}(P; C_1, C_2) = \dim \mathcal{O}_P/(f_1, f_2)_P \qquad \text{(B.2)}$$

where $(f_1, f_2)_P$ is the ideal generated by $f_1$ and $f_2$ in
$\mathcal{O}_P$. The dimension is well-defined whenever $C_1$ and $C_2$ do not have
any common components containing P. The main properties of this notion of
multiplicity are that it is positive, biadditive (meaning that if
$C_1 = C + C'$, then
$\mathrm{mult}(P; C_1, C_2) = \mathrm{mult}(P; C, C_2) + \mathrm{mult}(P; C', C_2)$) and
equal to 1 whenever $C_1$ and $C_2$ intersect transversally at P (meaning
that the tangents intersect transversally). In particular, we have

$$\mathrm{mult}(P; C_1, C_2) \ge 1 \iff P \in C_1 \cap C_2.$$


2.4. Theorem. (Bézout) Let $C_1$ and $C_2$ be two plane curves of degree $d_1$
and $d_2$ in $\mathbb{P}^2$ without any common components. Then

$$\sum_{P\in C_1\cap C_2} \mathrm{mult}(P; C_1, C_2) = d_1d_2.$$

Proof. We point out that the finiteness of $C_1 \cap C_2$ follows easily from
the existence of non-zero polynomials such that
$a(x)f_1(x,y) + b(x)f_2(x,y) = c(x)$ and
$a'(y)f_1(x,y) + b'(y)f_2(x,y) = c'(y)$. Up to changing projective
coordinates, we can thus assume that the line at infinity does not intersect
$C_1 \cap C_2$. This condition translates to the fact that the homogeneous
parts of largest degree, $f_1^{(d_1)}$ and $f_2^{(d_2)}$, are relatively prime.
We will now prove, using this hypothesis, that

$$\dim k[x,y]/(f_1, f_2) = d_1d_2. \qquad \text{(B.3)}$$


Let $A_d$ be the set of polynomials of $k[x,y]$ of degree $\le d$. It is a
vector space of dimension $s(d) = \binom{d+2}{2} = (d+1)(d+2)/2$. The map
$A_d \to k[x,y]/(f_1, f_2)$ is surjective for large enough $d$; the kernel
$B_d := A_d \cap (f_1, f_2)$ contains $I_d := A_{d-d_1} f_1 + A_{d-d_2} f_2$.
We will prove that $I_d = B_d$ whenever $d > d_1 + d_2$. Let
$f = g_1 f_1 + g_2 f_2 \in B_d$. We can assume that $g_1$ and $g_2$ have
minimal degrees $e_1$ and $e_2$, and we want to show that $e_1 \le d - d_1$
and $e_2 \le d - d_2$. If we have, for example, $e_1 > d - d_1$, then we see,
considering the homogeneous parts of largest degree $g_i^{(e_i)}$ and
$f_i^{(d_i)}$, that $e_1 + d_1 = e_2 + d_2$ and that
$g_1^{(e_1)} f_1^{(d_1)} + g_2^{(e_2)} f_2^{(d_2)} = 0$. Since $f_1^{(d_1)}$
and $f_2^{(d_2)}$ are relatively prime, we can deduce from this that
$g_1^{(e_1)} = -f_2^{(d_2)} h$ and $g_2^{(e_2)} = f_1^{(d_1)} h$ for some
homogeneous polynomial $h$, but then
$f = f_1(g_1 + f_2 h) + f_2(g_2 - f_1 h)$ allows us to write $f$ with
polynomials of degree $< e_1$ and $< e_2$, contradicting minimality. Thus,
for large enough $d$ we have $A_d/I_d \cong k[x,y]/(f_1, f_2)$. We know that
$A_{d-d_1} f_1 \cap A_{d-d_2} f_2 = A_{d-d_1-d_2} f_1 f_2$, hence
$\dim k[x,y]/(f_1, f_2)$ equals:


$$\begin{aligned}
\dim A_d/I_d &= \dim A_d - \dim A_{d-d_1} f_1 - \dim A_{d-d_2} f_2 + \dim A_{d-d_1-d_2} f_1 f_2 \\
&= s(d) - s(d-d_1) - s(d-d_2) + s(d-d_1-d_2) \\
&= d_1 d_2.
\end{aligned}$$
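
The dimension count above is a purely arithmetic identity in $s(d) = (d+1)(d+2)/2$, so it can be checked mechanically. The following Python lines are only a sanity check under that formula; the function names `s` and `bezout_count` are illustrative and do not come from the text.

```python
def s(d: int) -> int:
    """Dimension of A_d, the polynomials in two variables of degree <= d."""
    return (d + 1) * (d + 2) // 2 if d >= 0 else 0


def bezout_count(d: int, d1: int, d2: int) -> int:
    """Inclusion-exclusion count dim A_d / I_d used in the proof."""
    return s(d) - s(d - d1) - s(d - d2) + s(d - d1 - d2)


if __name__ == "__main__":
    # For every d >= d1 + d2 the count collapses to d1 * d2.
    for d1 in range(1, 6):
        for d2 in range(1, 6):
            for d in range(d1 + d2, d1 + d2 + 5):
                assert bezout_count(d, d1, d2) == d1 * d2
    print(bezout_count(10, 3, 4))
```

The quadratic terms of the four values of $s$ contribute $2 d_1 d_2$ before halving, while the linear and constant terms cancel, which is why the result does not depend on $d$.
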

The proof of the following equality finishes the proof of the theorem.

$$\dim k[x,y]/(f_1, f_2) = \sum_{P \in C_1 \cap C_2} \mathrm{mult}(P; C_1, C_2). \tag{B.4}$$


This equality is a special case of the decomposition of a module with finite
length. Recall that an $A$-module $M$ has finite length if there exists a
sequence of submodules $M = M_0 \supset M_1 \supset \cdots \supset M_\ell = 0$
such that each $M_i/M_{i+1}$ is a simple $A$-module (i.e., a module of the
type $A/\mathfrak{m}$, where $\mathfrak{m}$ is a maximal ideal). We therefore
have the decomposition


$$M = \bigoplus_{\mathfrak{p}} M_{\mathfrak{p}},$$


where the (finite) sum is taken over the maximal ideals $\mathfrak{p}$ of $A$,
and $M_{\mathfrak{p}}$ designates the localization of the module with respect
to $\mathfrak{p}$. The proof of this last assertion can be done by induction
on the length $\ell$. If $\ell = 1$, then $M = A/\mathfrak{m}$ and
$M_{\mathfrak{p}} = 0$ if $\mathfrak{p} \ne \mathfrak{m}$, whereas
$(A/\mathfrak{m})_{\mathfrak{m}} = A/\mathfrak{m}$. Then, if we know the
result for $M_1$ (which has length $\le \ell - 1$) and $M/M_1$ (which is
simple), we can deduce the result for $M$.


2.5. Remark. We have seen that we can describe the points of $\mathbf{P}^n$
as lines through the origin of a vector space $E$ of dimension $n+1$ or as
the hyperplanes of $E^*$. We can generalize this by introducing the set
$G(E, k)$ of vector subspaces of $E$ with a given dimension $k$ and by
endowing this set with the structure of a projective variety which is called
the Grassmannian. To do this, we can proceed as follows: for a subspace $L$
of dimension $k$ in a vector space $E$ of dimension $n+1$, we choose
$e_1, \dots, e_k$ to be a basis for $L$, and we set
$\phi(e_1, \dots, e_k) = e_1 \wedge \cdots \wedge e_k \in \Lambda^k(E)$ (if we
write the coordinates of the $e_i$ in a fixed basis of $E$ in a matrix, then
the coordinates of $e_1 \wedge \cdots \wedge e_k$ are the $k \times k$ minors
of this matrix). We can check that two bases of $L$ generate the same line in
$\Lambda^k(E)$. Hence, we can set
$\Phi(L) := [\phi(e_1, \dots, e_k)] \in \mathbf{P}\Lambda^k(E)$. We then show
that the map $\Phi : G(E, k) \to \mathbf{P}\Lambda^k(E)$ is injective, and its
image is a projective subvariety.
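
As an illustration of the remark, the sketch below computes the $k \times k$ minors of a matrix whose rows are a basis of $L$ and checks that a second basis of the same subspace gives proportional coordinates. It is a small hand-rolled example with integer entries; the helper names `det`, `plucker` and `same_projective_point` are hypothetical and not taken from the text.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small sizes only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def plucker(basis):
    """All k x k minors of the k x N matrix whose rows are the basis vectors."""
    k, n = len(basis), len(basis[0])
    return [det([[row[c] for c in cols] for row in basis])
            for cols in combinations(range(n), k)]


def same_projective_point(u, v):
    """True when u and v are non-zero and proportional."""
    size = len(u)
    proportional = all(u[i] * v[j] == u[j] * v[i]
                       for i in range(size) for j in range(size))
    return any(u) and any(v) and proportional


if __name__ == "__main__":
    basis1 = [[1, 0, 2, 3], [0, 1, 4, 5]]
    # Second basis of the same plane: (e1 + e2, e1 - e2).
    basis2 = [[1, 1, 6, 8], [1, -1, -2, -2]]
    print(plucker(basis1))   # [1, 4, 5, -2, -3, -2]
    print(plucker(basis2))   # [-2, -8, -10, 4, 6, 4]
    print(same_projective_point(plucker(basis1), plucker(basis2)))  # True
```

The factor between the two coordinate vectors is the determinant of the change of basis, here $-2$, which is exactly why the line in $\Lambda^k(E)$ is independent of the basis.
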


2.6. Remark. More generally, we can define the intersection number of two
subvarieties $W_1$ and $W_2$ of $V$ of complementary dimensions, i.e., such
that $\dim W_1 + \dim W_2 = \dim V$. The construction of these intersection
numbers is beyond the scope of this text (see [35] and most of all [81]), but
we can easily state some of its properties.


2.7. Definition. An algebraic cycle of codimension $i$ on $V$ is a linear
combination with integer coefficients of subvarieties of codimension $i$. The
group of cycles of codimension $i$ is denoted


$$\mathcal{Z}^i(V) = \bigoplus_{\mathrm{codim}(W) = i} \mathbf{Z}[W],$$


where $W$ ranges over the subvarieties of codimension $i$.


2.8. Proposition. Let $V$ be a smooth projective variety of dimension $r$.
There exists a $\mathbf{Z}$-bilinear mapping, invariant under algebraic
deformation,


$$\mathcal{Z}^i(V) \times \mathcal{Z}^{r-i}(V) \to \mathbf{Z}, \qquad (W, W') \mapsto W \cdot W',$$


such that if $W$ and $W'$ intersect transversally at a finite number of
points, then $W \cdot W' = \mathrm{card}(W \cap W')$. If we further impose
functoriality, namely that for every finite morphism $\phi : V' \to V$ we have
$\phi^{-1}(W) \cdot \phi^{-1}(W') = \deg(\phi)\, W \cdot W'$, then the mapping
is unique.


The notion of an algebraic deformation of cycles of $V$ can be briefly
described as follows. Let $T$ be a variety and $Z \subset V \times T$ a
subvariety such that for every point $t \in T$, we can define a cycle
$Z_t := Z \cap (V \times \{t\})$ on $V$. If $t_1$ and $t_2$ are in $T$, we say
that $Z_{t_1}$ can be deformed into $Z_{t_2}$. The invariance property of the
proposition can be translated into the fact that for every cycle $W$ of
dimension complementary to $Z_t$, we have $Z_{t_1} \cdot W = Z_{t_2} \cdot W$.


2.9. Definition. Two cycles $W, W' \in \mathcal{Z}^i(V)$ are numerically
equivalent if for every $Y \in \mathcal{Z}^{r-i}(V)$, we have
$W \cdot Y = W' \cdot Y$. The quotient of $\mathcal{Z}^i(V)$ by this
equivalence relation is denoted $\mathrm{Num}^i(V)$.


The numerical equivalence relation is of great importance in algebraic
geometry: it is at the heart of the theory of Grothendieck motives. We know
that $\mathrm{rank}_{\mathbf{Z}}\, \mathrm{Num}^i(V)$ is finite. The formal
properties of the numerical equivalence relation resemble those of the
homological equivalence relation, and Grothendieck conjectured that the two
coincide. We have, for example, $\mathrm{Num}^i(\mathbf{P}^n) = \mathbf{Z}$.


Appendix C 


Galois Theory


“I’m Nobody! Who are you? 

Are you nobody, too? 

Then there’s a pair of us – don’t tell!
They’d banish us, you know!”


EMILY DICKINSON


By relying on some results from Galois theory (see, for example, [43]),
we will describe more explicitly the decomposition law of prime ideals in
the ring of integers of a number field in the case where the extension is
Galois (see [7] for more details), before stating Chebotarev’s theorem,
which connects this algebraic theory to analytic theory and provides an
elegant generalization of Dirichlet’s theorem on arithmetic progressions.
The last two sections present the beginnings of the theory of Galois
representations, i.e., the study of the absolute Galois group
$G_{\mathbf{Q}} := \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$. First, class
field theory (see [44]) provides a description of abelian extensions and
also of representations of dimension 1 (which allows us to state a vast
generalization of the quadratic reciprocity law). Then, we give some examples
and basic properties of representations of dimension $> 1$ (see [27], [63]
and [64]).


1. Galois Theory and Number Fields 


Let us briefly recall the fundamentals of Galois theory. 


Notation. We denote by $\mathrm{Aut}(F)$ the group of automorphisms of a
field $F$. If $K$ is a subfield of $F$, then we denote by $\mathrm{Aut}(F/K)$
the subgroup of $\mathrm{Aut}(F)$ which acts trivially on $K$. If $G$ is a
subgroup of $\mathrm{Aut}(F)$, we denote by
$F^G := \{x \in F \mid \forall g \in G,\ g(x) = x\}$ the subfield fixed by $G$.


An extension $F/K$ is Galois if it is normal and separable. In this case,
we call the group $\mathrm{Aut}(F/K)$ the Galois group of the extension,
denoted by $\mathrm{Gal}(F/K)$. If $F/K$ is finite of degree $n = [F : K]$,
this amounts to saying that $\mathrm{card}\,\mathrm{Aut}(F/K) = n$. To every
subgroup $H$ of $\mathrm{Gal}(F/K)$, we can associate the extension $F^H$,
and to every intermediate extension $L$, we can associate the subgroup
$\mathrm{Aut}(F/L) = \mathrm{Gal}(F/L)$. The fundamental theorem of Galois
theory (for finite extensions) says that we have thus established a bijection
between intermediate extensions $K \subset L \subset F$ and subgroups of
$\mathrm{Gal}(F/K)$. The same theorem further states that the extension $F^H$
is Galois over $K$ if and only if $H$ is normal in $\mathrm{Gal}(F/K)$, and
if that is the case, then $\mathrm{Gal}(F^H/K) = \mathrm{Gal}(F/K)/H$. More
generally, if $\sigma \in \mathrm{Gal}(F/K)$ and $K \subset L \subset F$, we
have $\mathrm{Gal}(F/\sigma L) = \sigma\, \mathrm{Gal}(F/L)\, \sigma^{-1}$,
and if $H \subset \mathrm{Gal}(F/K)$, we have
$\sigma(F^H) = F^{\sigma H \sigma^{-1}}$.

To generalize this to possibly infinite extensions, we introduce the Krull
topology on $G := \mathrm{Gal}(F/K)$, where a basis of neighborhoods of the
identity is given by the subgroups $\mathrm{Gal}(F/L)$, with $L/K$ ranging
over the finite subextensions (these subgroups have finite index in $G$).
The Galois correspondence is then a bijection between subextensions
$K \subset L \subset F$ and closed subgroups of $G$, with the finite
extensions of $K$ corresponding to the subgroups of $G$ which are both
closed and open (clopen).


1.1. Examples. 1) (Finite fields) The Galois group
$\mathrm{Gal}(\mathbf{F}_{q^m}/\mathbf{F}_q)$ is canonically isomorphic to
$\mathbf{Z}/m\mathbf{Z}$, the canonical generator being the “Frobenius”
$\Phi(x) = x^q$. We can deduce from this a description of the absolute
Galois group:

$$\mathrm{Gal}(\bar{\mathbf{F}}_q/\mathbf{F}_q) = \varprojlim_m \mathbf{Z}/m\mathbf{Z} = \prod_\ell \mathbf{Z}_\ell,$$

where the product is over prime numbers $\ell$ and $\mathbf{Z}_\ell$
designates the $\ell$-adic integers.

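2) (Cyclotomic extensions) Let $\zeta$ be a primitive $n$th root of unity
(for example $\zeta = \exp(2\pi i/n)$). The Galois group
$\mathrm{Gal}(\mathbf{Q}(\zeta)/\mathbf{Q})$ is canonically isomorphic to
$(\mathbf{Z}/n\mathbf{Z})^*$, and the isomorphism $\sigma \mapsto m(\sigma)$
is given by $\sigma(\zeta) = \zeta^{m(\sigma)}$ (we are using the
irreducibility of cyclotomic polynomials here, Theorem 2-6.2.7). From this,
we can deduce a description of the Galois group of the extension
$\mathbf{Q}(\mu_{\ell^\infty})$ generated by all of the $\ell^n$th roots of
unity:


$$\mathrm{Gal}(\mathbf{Q}(\mu_{\ell^\infty})/\mathbf{Q}) = \varprojlim_n (\mathbf{Z}/\ell^n\mathbf{Z})^* = \mathbf{Z}_\ell^*.$$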


3) (Kummer extensions) Let $K$ be a field containing the $m$th roots of
unity, i.e., $\mu_m \subset K^*$, and let $a \in K^*$ and
$\beta \in \bar{K}$ be such that $\beta^m = a$, which will be (slightly
abusively) denoted by $\beta = \sqrt[m]{a}$. The extension $K(\beta)/K$ is
then Galois, and its Galois group is isomorphic to a subgroup of $\mu_m$, the
injective homomorphism $\zeta : \mathrm{Gal}(K(\beta)/K) \to \mu_m$ being
given by $\sigma(\beta) = \zeta(\sigma)\beta$.

4) (Extensions generated by torsion points of an elliptic curve) Let $E$ be
an elliptic curve defined over $K$. We denote by $K(E[N])$ the field
generated by the coordinates (in $\bar{K}$) of the torsion points killed by
$N$. The extension $K(E[N])/K$ is Galois, and by remembering (cf. Theorem
5-5.6) that $E[N] := \mathrm{Ker}[N] \cong (\mathbf{Z}/N\mathbf{Z})^2$, we
see that $\mathrm{Gal}(K(E[N])/K)$ can be identified with a subgroup of
$\mathrm{Aut}(\mathbf{Z}/N\mathbf{Z} \times \mathbf{Z}/N\mathbf{Z}) = \mathrm{GL}_2(\mathbf{Z}/N\mathbf{Z})$.
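
The finite groups appearing in Examples 2) and 4) are small enough to enumerate. The sketch below lists $(\mathbf{Z}/n\mathbf{Z})^*$, checks that reduction from level $\ell^{n+1}$ to level $\ell^n$ is surjective on units (the compatibility behind the projective limit), and counts $\mathrm{GL}_2(\mathbf{Z}/N\mathbf{Z})$ by brute force. The function names are illustrative only, and the brute-force count is practical only for very small $N$.

```python
from itertools import product
from math import gcd


def units(n):
    """Invertible residues modulo n, i.e. the elements of (Z/nZ)^*."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def gl2_order(n):
    """Count 2x2 matrices over Z/nZ whose determinant is a unit."""
    return sum(1 for a, b, c, d in product(range(n), repeat=4)
               if gcd((a * d - b * c) % n, n) == 1)


if __name__ == "__main__":
    print(units(12))                                       # [1, 5, 7, 11]
    # Units mod 27 reduce onto the units mod 9.
    print(sorted({a % 9 for a in units(27)}) == units(9))  # True
    print([gl2_order(n) for n in (2, 3, 4)])               # [6, 48, 96]
```

The counts 6, 48 and 96 bound the degree $[K(E[N]) : K]$ for $N = 2, 3, 4$, since the Galois group embeds in $\mathrm{GL}_2(\mathbf{Z}/N\mathbf{Z})$.
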


We will now move on to describing the decomposition into prime ideals in 
Galois extensions (see, for example, [7] for the proofs). 


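Let $K/\mathbf{Q}$ be a Galois extension with Galois group
$G := \mathrm{Gal}(K/\mathbf{Q})$. Let $p$ be a prime number and
$\mathfrak{p}$ a prime ideal of $\mathcal{O}_K$ over $p$. The decomposition
group of $\mathfrak{p}$ is

$$D(\mathfrak{p}/p) = \{\sigma \in G \mid \sigma(\mathfrak{p}) = \mathfrak{p}\}.$$


If $\sigma \in D(\mathfrak{p}/p)$, we can define
$\bar\sigma : \mathcal{O}_K/\mathfrak{p} \to \mathcal{O}_K/\mathfrak{p}$ by
the diagram


$$\begin{array}{ccc}
\mathcal{O}_K & \xrightarrow{\ \sigma\ } & \mathcal{O}_K \\
\downarrow & & \downarrow \\
\mathcal{O}_K/\mathfrak{p} & \xrightarrow{\ \bar\sigma\ } & \mathcal{O}_K/\mathfrak{p}.
\end{array}$$

We set $\mathbf{F}_{\mathfrak{p}} = \mathcal{O}_K/\mathfrak{p}$. The map
$\sigma \mapsto \bar\sigma$ defines a homomorphism
$r_{\mathfrak{p}} : D(\mathfrak{p}/p) \to \mathrm{Gal}(\mathbf{F}_{\mathfrak{p}}/\mathbf{F}_p)$.
By definition, the kernel is called the inertia group of $\mathfrak{p}$, in
other words


$$I(\mathfrak{p}/p) := \{\sigma \in D(\mathfrak{p}/p) \mid \forall x \in \mathcal{O}_K,\ \sigma(x) \equiv x \bmod \mathfrak{p}\}.$$


1.2. Lemma. The Galois group $G = \mathrm{Gal}(K/\mathbf{Q})$ acts
transitively on the set of prime ideals of $\mathcal{O}_K$ over $p$. The
homomorphism
$r_{\mathfrak{p}} : D(\mathfrak{p}/p) \to \mathrm{Gal}(\mathbf{F}_{\mathfrak{p}}/\mathbf{F}_p)$
is surjective.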


We will now do some calculations which allow us to see how the inertia 
groups and decomposition groups vary when we change the ideal p. 


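1.3. Lemma. Let $\mathfrak{p}$ and $\mathfrak{p}'$ be prime ideals of
$\mathcal{O}_K$ over $p$, and let $\sigma \in G$ be such that
$\sigma(\mathfrak{p}) = \mathfrak{p}'$. Then


$$D(\mathfrak{p}'/p) = \sigma D(\mathfrak{p}/p) \sigma^{-1} \quad\text{and}\quad I(\mathfrak{p}'/p) = \sigma I(\mathfrak{p}/p) \sigma^{-1}.$$


As we recalled, $\mathrm{Gal}(\mathbf{F}_{\mathfrak{p}}/\mathbf{F}_p)$ is a
cyclic group whose canonical generator is given by the Frobenius homomorphism
$\Phi(x) = x^p$.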


1.4. Definition. A Frobenius of $\mathfrak{p}$ is an element
$\sigma \in D(\mathfrak{p}/p)$ such that $r_{\mathfrak{p}}(\sigma) = \Phi$.
That is to say, $\sigma$ satisfies

$$\forall x \in \mathcal{O}_K, \quad \sigma(x) \equiv x^p \bmod \mathfrak{p}.$$


We denote by $\mathrm{Frob}_{\mathfrak{p}}$ such an element (if we need to
specify the field, we write $\mathrm{Frob}_{\mathfrak{p}, K/\mathbf{Q}}$).


1.5. Remarks. 1) If we replace $\mathfrak{p}$ by
$\mathfrak{p}' := \sigma(\mathfrak{p})$, we can easily see that

$$\mathrm{Frob}_{\mathfrak{p}'} = \sigma\, \mathrm{Frob}_{\mathfrak{p}}\, \sigma^{-1}.$$


In particular, if the extension is abelian, the element only depends on $p$,
and we can denote it by $\mathrm{Frob}_p$. Keep in mind, however, that the
notation $\mathrm{Frob}_p$ in general only designates a “conjugacy class
modulo the inertia group”.


2) We will again look at the example $K = \mathbf{Q}(\mu_n)$. If $p$ does
not divide $n$, the element $\sigma_p = \mathrm{Frob}_p$ is well-defined and
is given by $\sigma_p(\zeta) = \zeta^p$. However, if, for example,
$n = p^m$, then the extension $\mathbf{Q}(\mu_{p^m})/\mathbf{Q}$ is totally
ramified at $p$, and every element of the Galois group is a Frobenius at $p$.


3) In the example $K = \mathbf{Q}(\sqrt{d})$, we can identify
$\mathrm{Gal}(K/\mathbf{Q})$ with the group $\{+1, -1\}$, the nontrivial
automorphism being given by $\sigma(a + b\sqrt{d}) = a - b\sqrt{d}$. Let $p$
be an odd prime which does not divide $d$. Then we have


$$(\sqrt{d})^p = d^{\frac{p-1}{2}} \sqrt{d} \equiv \left(\frac{d}{p}\right) \sqrt{d} \bmod p,$$


therefore $\mathrm{Frob}_p$ is nothing other than the Legendre symbol of $d$
with respect to $p$ (i.e., $+1$ if $d$ is a square modulo $p$ and $-1$ if
not).


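4) We can generalize these notions to Galois extensions of number fields
$L/K$. For example, if $\mathfrak{q}$ is a prime ideal of $L$ over
$\mathfrak{p}$, a prime ideal of $K$, we denote by
$\mathrm{Frob}_{\mathfrak{q}}$ an element of $\mathrm{Gal}(L/K)$ such that
$\mathrm{Frob}_{\mathfrak{q}}(x) \equiv x^{N\mathfrak{p}} \bmod \mathfrak{q}$.


5) We have defined the decomposition group, the inertia group and the
Frobenius element for finite Galois extensions. No real difficulties arise
when extending these definitions to infinite extensions. For example, if we
consider $G_{\mathbf{Q}} := \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ and
$p$ prime, we denote by $\bar{\mathbf{Z}}$ the ring of algebraic integers and
by $\mathfrak{p}$ a maximal ideal of $\bar{\mathbf{Z}}$ over $p$. We set
$D(\mathfrak{p}/p) = \{\sigma \in G_{\mathbf{Q}} \mid \sigma(\mathfrak{p}) = \mathfrak{p}\}$,
then define the reduction homomorphism
$r_{\mathfrak{p}} : D(\mathfrak{p}/p) \to \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$
and let $I(\mathfrak{p}/p) := \mathrm{Ker}(r_{\mathfrak{p}})$, and finally
let $\mathrm{Frob}_{\mathfrak{p}}$ be an element of $D(\mathfrak{p}/p)$ whose
image under $r_{\mathfrak{p}}$ is the Frobenius in characteristic $p$.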
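
Part 3) of the remarks can be tested directly: raising $\sqrt{d}$ to the $p$th power in $\mathbf{Z}[\sqrt{d}]/(p)$ should return $\pm\sqrt{d}$, with the sign given by the Legendre symbol. The sketch below represents $a + b\sqrt{d}$ as a pair `(a, b)` modulo $p$; the representation and all function names are assumptions made for the illustration, not notation from the text.

```python
def mul(u, v, d, p):
    """Product of u = a + b*sqrt(d) and v in Z[sqrt(d)]/(p), as pairs (a, b)."""
    return ((u[0] * v[0] + d * u[1] * v[1]) % p,
            (u[0] * v[1] + u[1] * v[0]) % p)


def power(u, n, d, p):
    """u**n in Z[sqrt(d)]/(p) by repeated squaring."""
    result = (1 % p, 0)
    base = (u[0] % p, u[1] % p)
    while n:
        if n & 1:
            result = mul(result, base, d, p)
        base = mul(base, base, d, p)
        n >>= 1
    return result


def legendre(d, p):
    """Legendre symbol of d modulo an odd prime p, by Euler's criterion."""
    r = pow(d, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def frobenius_sign(d, p):
    """+1 or -1 according to whether x -> x**p fixes or negates sqrt(d) mod p."""
    _, b = power((0, 1), p, d, p)
    return 1 if b == 1 else -1


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        for d in (2, 3, 5, -1, -7):
            if d % p != 0:
                assert frobenius_sign(d, p) == legendre(d, p)
    print(frobenius_sign(2, 7), frobenius_sign(3, 7))
```

For instance $3^2 \equiv 2 \bmod 7$, so $2$ is a square modulo $7$ and the Frobenius at $7$ acts trivially on $\mathbf{Q}(\sqrt{2})$, whereas $3$ is not a square modulo $7$.
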


By the previous lemma, such a Frobenius element associated to a prime 
ideal of K always exists and is unique modulo the inertia subgroup. Let us 
see when the inertia subgroup is trivial. 


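1.6. Proposition. Let $K/\mathbf{Q}$ be a finite Galois extension with Galois
group $G := \mathrm{Gal}(K/\mathbf{Q})$, and let $p$ be a prime number. The
decomposition of $p$ in $\mathcal{O}_K$ is written


$$p\mathcal{O}_K = (\mathfrak{p}_1 \cdots \mathfrak{p}_g)^e,$$


where $e = \mathrm{card}\, I(\mathfrak{p}_i/p)$, $N\mathfrak{p}_i = p^f$,
$ef = \mathrm{card}\, D(\mathfrak{p}_i/p)$ and
$g = (\mathrm{Gal}(K/\mathbf{Q}) : D(\mathfrak{p}_i/p))$. Then we have
$efg = [K : \mathbf{Q}]$.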

1.7. Corollary. With the same hypotheses and notations, we have:


$$e = 1 \iff p \text{ is unramified} \iff p \text{ does not divide } \Delta_K.$$
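
For the cyclotomic field $K = \mathbf{Q}(\mu_n)$ and a prime $p$ not dividing $n$, the quantities of the proposition can be computed by hand: by Remark 1.5, part 2, the Frobenius is the class of $p$ in $(\mathbf{Z}/n\mathbf{Z})^*$, so $e = 1$, $f$ is the multiplicative order of $p$ modulo $n$, and $g = \varphi(n)/f$. This is the standard description for cyclotomic fields rather than something proved in the text, and the function names below are illustrative.

```python
from math import gcd


def euler_phi(n):
    """Order of (Z/nZ)^*, which is the degree [Q(mu_n) : Q]."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def multiplicative_order(a, n):
    """Smallest k >= 1 with a**k = 1 mod n, for a coprime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    one = 1 % n
    k, x = 1, a % n
    while x != one:
        x = x * a % n
        k += 1
    return k


def splitting_type(p, n):
    """(e, f, g) for a prime p not dividing n in Q(mu_n)."""
    f = multiplicative_order(p, n)
    return 1, f, euler_phi(n) // f


if __name__ == "__main__":
    for p, n in [(2, 7), (5, 12), (11, 5), (3, 8)]:
        e, f, g = splitting_type(p, n)
        assert e * f * g == euler_phi(n)
        print(p, n, (e, f, g))
```

For example, $2$ has order $3$ modulo $7$, so $2$ splits into $g = 2$ primes of residue degree $f = 3$ in $\mathbf{Q}(\mu_7)$.
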


We can introduce an often useful filtration of the inertia group by defining
the higher ramification groups:


$$G_{i,\mathfrak{p}} := \{\sigma \in D(\mathfrak{p}/p) \mid \forall x \in \mathcal{O}_K,\ \sigma(x) \equiv x \bmod \mathfrak{p}^{i+1}\}.$$


Thus $G_0$ is the inertia group. We can see that the $G_i$ are $p$-groups
for $i \ge 1$. In fact, $G_1$ is the Sylow $p$-subgroup of $G_0$, and, in
particular, the $G_i$ are trivial whenever $p$ does not divide
$\mathrm{card}(G)$ and $i \ge 1$.


We could ask ourselves how $\mathrm{Frob}_{\mathfrak{p}}$ varies when we vary
the prime ideal. The answer is given by the following theorem.


1.8. Theorem. (Chebotarev) If $C$ is a conjugacy class of
$G = \mathrm{Gal}(K/\mathbf{Q})$, then there exist infinitely many prime
numbers $p$ such that $\mathrm{Frob}_p$ is in $C$. To be more precise, the
density of such $p$ is exactly $|C|/|G|$.


The previous theorem is a vast generalization of the theorem on arithmetic
progressions (see [44] for the proof). To see why, choose
$K = \mathbf{Q}(\zeta)$ where $\zeta = \exp(2\pi i/n)$ and an element
$a \in (\mathbf{Z}/n\mathbf{Z})^* = \mathrm{Gal}(K/\mathbf{Q})$; then the
equality $\mathrm{Frob}_p = a$ means $p \equiv a \bmod n$.
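
In the cyclotomic case the density statement says that each invertible class modulo $n$ receives a proportion $1/\varphi(n)$ of the primes. The sketch below measures these proportions among the primes up to a bound; it is only a numerical experiment (a finite count proves nothing about densities), it assumes $n \ge 2$, and the function names are illustrative.

```python
from math import gcd


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def residue_class_frequencies(n, limit):
    """Proportion of primes p <= limit, p not dividing n, in each class mod n."""
    counts = {a: 0 for a in range(1, n + 1) if gcd(a, n) == 1}
    for p in primes_up_to(limit):
        if n % p != 0:
            counts[p % n] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


if __name__ == "__main__":
    for n in (4, 10):
        freqs = residue_class_frequencies(n, 10_000)
        print(n, {a: round(f, 3) for a, f in freqs.items()})
```

With $n = 4$ both classes come out close to $1/2$, and with $n = 10$ the four classes $1, 3, 7, 9$ come out close to $1/4$, as the theorem predicts with $|C| = 1$ and $|G| = \varphi(n)$.
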


2. Abelian Extensions 


If $G$ is a group, we denote by $G^{\mathrm{ab}}$ the quotient of $G$ by its
commutator group (i.e., the largest abelian quotient of $G$). We will
describe, briefly and without proof, the group
$\mathrm{Gal}(\bar{K}/K)^{\mathrm{ab}}$ for a number field $K$ and indicate
why this theory can be considered to have sprouted from the quadratic
reciprocity law.


Let $L/K$ be a number field extension with an abelian Galois group. If
$\mathfrak{p}$ is a prime ideal of $\mathcal{O}_K$ which is unramified in
$L/K$, we have seen that the element
$\mathrm{Frob}_{\mathfrak{p}} \in \mathrm{Gal}(L/K)$ is well-defined. If we
call $S$ the set of prime ideals of $K$ ramified in $L/K$ and $I_K^S$ the
group of fractional ideals relatively prime to $S$, then we can define the
homomorphism

$$\psi_{L/K} : I_K^S \to \mathrm{Gal}(L/K), \qquad \prod_{\mathfrak{p} \notin S} \mathfrak{p}^{m_{\mathfrak{p}}} \mapsto \prod_{\mathfrak{p} \notin S} \mathrm{Frob}_{\mathfrak{p}}^{m_{\mathfrak{p}}},$$

which we know to be surjective by Chebotarev’s theorem (Theorem C-1.8).


The first step in analyzing the kernel is to see that norms of ideals in $L$
are in $\mathrm{Ker}\, \psi_{L/K}$.




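2.1. Lemma. Let $L/K$ be an abelian extension and $F/K$ an extension. If
$\mathfrak{q}$ is a prime ideal of $F$ over a prime ideal $\mathfrak{p}$ of
$K$, we denote by
$f := [\mathcal{O}_F/\mathfrak{q} : \mathcal{O}_K/\mathfrak{p}]$. Then


$$\mathrm{Frob}_{\mathfrak{q}, LF/F} = (\mathrm{Frob}_{\mathfrak{p}, L/K})^f.$$

More generally, we have the following formula for an ideal $\mathfrak{A}$ of
$F$:

$$\psi_{LF/F}(\mathfrak{A}) = \psi_{L/K}(N^F_K \mathfrak{A}). \tag{C.1}$$


In particular, the norm of an ideal in $L$ is in the kernel of $\psi_{L/K}$.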


Proof. The statement of the lemma implies a natural identification of
$\mathrm{Gal}(LF/F)$ with a subgroup of $\mathrm{Gal}(L/K)$. If
$\tau := \mathrm{Frob}_{\mathfrak{q}, LF/F}$, then we have
$\tau(x) \equiv x^{N\mathfrak{q}} \bmod \mathfrak{q}'$ (for $\mathfrak{q}'$ a
prime ideal of $LF$ over $\mathfrak{q}$ and $x \in \mathcal{O}_{LF}$). If we
restrict to $x \in \mathcal{O}_L$, we can thus write
$\tau(x) \equiv x^{(N\mathfrak{p})^f} \bmod \mathfrak{q}' \cap \mathcal{O}_L$.
We know that $\mathfrak{q}' \cap \mathcal{O}_L$ is a prime ideal of $L$ over
$\mathfrak{p}$, which indeed shows that the restriction of $\tau$ to $L$ is
equal to $\mathrm{Frob}_{\mathfrak{p}}^f$. The formula for the norms can be
deduced by multiplicativity, and the last statement follows immediately from
taking $F = L$.


The second step in the analysis of the kernel of $\psi_{L/K}$ is much deeper
and forms the core of class field theory. For the proof, you can refer, for
example, to [44], but we should first introduce some vocabulary.


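2.2. Definition. Let $K$ be a number field. A cycle $\mathfrak{M}$ is given
by an ideal of $\mathcal{O}_K$ and a sign for every real place of $K$.
Alternatively, we can write


$$\mathfrak{M} = \sum_{\mathfrak{p}} m_{\mathfrak{p}} [\mathfrak{p}] + \sum_{v \mid \infty} n_v [v],$$

where the $m_{\mathfrak{p}} \in \mathbf{N}$ are almost all zero and $n_v = 0$
or $1$ if $v$ is Archimedean. For $a \in \mathcal{O}_K$, we write
$a \equiv 1\ [\mathfrak{M}]$ if
$a \equiv 1 \bmod \mathfrak{p}^{m_{\mathfrak{p}}}$ for every $\mathfrak{p}$
and $\sigma_v(a) > 0$ if $v$ is real and $n_v = 1$, and we set:


$$P_{\mathfrak{M}} = \{\mathfrak{A} = a\mathcal{O}_K \mid a \equiv 1\ [\mathfrak{M}]\}.$$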


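2.3. Theorem. (Artin reciprocity law) Let $L/K$ be an abelian extension
which is unramified outside of a set $S$ of places of $K$. There exists a
cycle $\mathfrak{M}$ with support in $S$ such that
$P_{\mathfrak{M}} \subset \mathrm{Ker}\, \psi_{L/K}$ (such a cycle is called
“admissible”). Moreover, we have the equality

$$\mathrm{Ker}\, \psi_{L/K} = P_{\mathfrak{M}}\, N^L_K(I_L^{\mathfrak{M}}) \tag{C.2}$$


and consequently the following isomorphism


$$\psi_{L/K} : I_K^{\mathfrak{M}} / P_{\mathfrak{M}}\, N^L_K(I_L^{\mathfrak{M}}) \cong \mathrm{Gal}(L/K). \tag{C.3}$$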

2.4. Remark. (Hilbert class field) We consider the case of an unramified
(including at Archimedean places) abelian extension $L/K$. We then obtain
a reciprocity homomorphism:


$$\psi_{L/K} : \mathrm{Cl}_K := I_K/P_K \to \mathrm{Gal}(L/K).$$


Note that the condition of being unramified at Archimedean places signifies
that the real places of $K$ extend only to real places of $L$, that is, if
an embedding $\sigma : L \to \mathbf{C}$ satisfies
$\sigma(K) \subset \mathbf{R}$, then $\sigma(L) \subset \mathbf{R}$. Since
the compositum of two unramified abelian extensions is unramified abelian,
we see that there exists a maximal unramified abelian extension, called the
Hilbert class field of $K$. In this case, we can specify the kernel of
$\psi_{L/K}$.


2.5. Proposition. Let $H/K$ be the Hilbert class field of $K$ (i.e., the
maximal unramified abelian extension). Then


$$\psi_{H/K} : \mathrm{Cl}_K \to \mathrm{Gal}(H/K)$$


is an isomorphism. Furthermore, every ideal $\mathfrak{A}$ of $K$ becomes
principal in $H$ (i.e., $\mathfrak{A}\mathcal{O}_H$ is a principal ideal).


Class field theory also allows us to “classify” abelian extensions, but for
this, a more elegant method comes from Chevalley, namely to introduce
$J_K$, the idele group of $K$ (see the last part of the section on $p$-adic
numbers in Chap. 6). To do this, we can reformulate the Artin reciprocity
law using the following lemma.


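2.6. Lemma. Let $L/K$ be an abelian extension of a number field and let
$\mathfrak{M}$ be an admissible cycle (i.e., such that
$P_{\mathfrak{M}} \subset \mathrm{Ker}\, \psi_{L/K}$). We then have the
following natural isomorphism:


$$J_K / K^* N^L_K(J_L) \to I_K^{\mathfrak{M}} / P_{\mathfrak{M}}\, N^L_K(I_L^{\mathfrak{M}}). \tag{C.4}$$


The isomorphism is obtained by associating to an idele
$a = (a_v)_{v \in M_K}$, whose coordinates are units at every finite $v$
appearing in $\mathfrak{M}$ (resp. $a_v > 0$ if $v$ is real Archimedean and
$n_v = 1$), the ideal
$\prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(a_{\mathfrak{p}})}$.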


2.7. Corollary. If $L/K$ is an abelian extension, we have a surjective
homomorphism:


$$J_K \to \mathrm{Gal}(L/K),$$

whose kernel is $K^* N^L_K(J_L)$.

2.8. Theorem. (Class field theory) The correspondence defined by

$$L \mapsto K^* N^L_K(J_L)$$


establishes a bijection between abelian extensions of $K$ and closed
subgroups of $J_K$ containing $K^*$ (i.e., closed subgroups of
$C_K := J_K/K^*$). Finite extensions therefore correspond to subgroups which
are both open and closed.


If $H$ (resp. $H'$) is the subgroup associated to $L/K$ (resp. to $L'/K$),
then $L \subset L'$ if and only if $H' \subset H$. The subgroup associated
to $LL'/K$ is $H \cap H'$, and the subgroup associated to $L \cap L'/K$ is
$HH'$. In other words, if $H = K^* N^L_K(J_L)$, then $[L : K] = (J_K : H)$
and $\mathrm{Gal}(L/K) \cong J_K/H$. Furthermore, if $v$ is a place of $K$,
the decomposition group (resp. the inertia group) is the image of
$K_v^* \to J_K \to J_K/H$ (resp. the image of $\mathcal{O}_v^*$). In
particular, $L/K$ is unramified at $v$ if and only if
$\mathcal{O}_v^* \subset H$.


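2.9. Examples. Let
$\mathfrak{M} = \sum_{\mathfrak{p}} m_{\mathfrak{p}} [\mathfrak{p}] + \sum_{v \mid \infty} n_v [v]$
be a cycle of $K$. We denote by


$$J_K^{\mathfrak{M}} := \{(a_v)_{v \in M_K} \in J_K \mid a_{\mathfrak{p}} \equiv 1 \bmod \mathfrak{p}^{m_{\mathfrak{p}}} \text{ and } \sigma_v(a_v) > 0 \text{ if } v \text{ is real and } n_v = 1\}.$$

The abelian extension $K^{\mathfrak{M}}$ of $K$ associated to the group
$K^* J_K^{\mathfrak{M}}$ is called the ray class field modulo
$\mathfrak{M}$. If we restrict ourselves to the case $K = \mathbf{Q}$ and
$\mathfrak{M} = m(\infty) = \sum_p \mathrm{ord}_p(m)[p] + [\infty]$, we can
prove that $\mathbf{Q}^{m(\infty)} = \mathbf{Q}(\exp(2\pi i/m))$. In
particular, we thus obtain the following classical result.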


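2.10. Theorem. (Kronecker-Weber) Every abelian extension of $\mathbf{Q}$ is
contained in a cyclotomic extension (generated by the roots of unity). In
particular,


$$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})^{\mathrm{ab}} = \mathrm{Gal}(\mathbf{Q}(\mu_\infty)/\mathbf{Q}) = \prod_p \mathbf{Z}_p^*. \tag{C.5}$$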


2.11. Remark. Let $p$ and $q$ be two distinct odd primes. Consider the
field $K = \mathbf{Q}(\sqrt{q'})$ (where $q' = (-1)^{\frac{q-1}{2}} q$, so
that $K/\mathbf{Q}$ is only ramified at $q$ and, if $q \equiv 3 \bmod 4$, at
$\infty$), and identify $\mathrm{Gal}(K/\mathbf{Q})$ with $\{+1, -1\}$.
We have seen (Remark C-1.5, part 3) that
$\mathrm{Frob}_p = \left(\frac{q'}{p}\right)$. The Artin reciprocity law
(Theorem C-2.3) tells us that this element only depends on the congruence
class of $p$ modulo $\mathfrak{M} = q$, and it can be proven that it is
equal to the identity, $+1$, if and only if $p$ is a square modulo $q$. We
thus obtain the quadratic reciprocity law in the form

$$\left(\frac{q'}{p}\right) = \left(\frac{p}{q}\right).$$
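
The statement that $\mathrm{Frob}_p = \left(\frac{q'}{p}\right)$ equals $+1$ exactly when $p$ is a square modulo $q$ amounts to the identity $\left(\frac{q'}{p}\right) = \left(\frac{p}{q}\right)$ with $q' = (-1)^{(q-1)/2} q$. The sketch below checks this identity numerically for small odd primes using Euler's criterion; `legendre` and `q_star` are illustrative names, and a finite check is of course not a proof.

```python
def legendre(a, p):
    """Legendre symbol of a modulo an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def q_star(q):
    """q' = (-1)**((q-1)/2) * q, so that q' is always 1 modulo 4."""
    return q if q % 4 == 1 else -q


if __name__ == "__main__":
    odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
    for p in odd_primes:
        for q in odd_primes:
            if p != q:
                assert legendre(q_star(q), p) == legendre(p, q)
    print(legendre(q_star(7), 3), legendre(3, 7))
```

For $q = 7$ and $p = 3$ we have $q' = -7 \equiv 2 \bmod 3$, which is not a square modulo $3$, and indeed $3$ is not a square modulo $7$.
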

3. Galois Representations


The study of the group
$G_{\mathbf{Q}} = \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$, or more
generally $G_L := \mathrm{Gal}(\bar{\mathbf{Q}}/L)$ for a number field $L$,
is clearly a fundamental problem. One way to attack it is to study the
representations of this group, i.e., the homomorphisms
$\rho : G_L \to \mathrm{GL}(V) \cong \mathrm{GL}_n(K)$ where $V$ is a vector
space over $K$ of dimension $n$. The three most interesting cases are
$K = \mathbf{F}_q$ (a finite field), $K = \mathbf{Q}_p$ (a $p$-adic field)
and $K = \mathbf{C}$.


The previous section essentially corresponds to one-dimensional
representations, since a representation
$G_L \to \mathrm{GL}_1(K) = K^*$ has an abelian image and can therefore be
factored through $G_L^{\mathrm{ab}}$.


3.1. Definition. An Artin representation of $K$ is a continuous finite
dimensional representation
$\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}(V) \cong \mathrm{GL}_n(\mathbf{C})$.


Since $G_K$ is compact and totally disconnected, the image of $\rho$ is
necessarily finite and $L := \bar{K}^{\mathrm{Ker}(\rho)}$ is thus a finite
extension of $K$. The representation therefore factors through a
representation of a finite group
$\mathrm{Gal}(L/K) \to \mathrm{GL}_n(\mathbf{C})$. We say that $\rho$ is
unramified at $\mathfrak{p}$ if
$I(\mathfrak{p}/p) \subset \mathrm{Ker}(\rho)$. It is clear that $\rho$ is
unramified outside of a finite set of prime numbers.


3.2. Definition. The Artin conductor of
$\rho : G_{\mathbf{Q}} \to \mathrm{GL}(V)$ is defined as:


$$N_\rho := \prod_p p^{n(\rho, p)},$$


where


$$n(\rho, p) = \sum_{i \ge 0} \frac{\dim V/V^{G_{i,\mathfrak{p}}}}{(G_{0,\mathfrak{p}} : G_{i,\mathfrak{p}})}.$$


Here $\mathfrak{p}$ designates an ideal of $K$ over $p$. It can be shown
that this formula does not depend on the choice of $\mathfrak{p}$. It is
clear that $n(\rho, p) = 0$ if $\rho$ is not ramified over $p$. Next, if
$G_{1,\mathfrak{p}} = \{1\}$ (no “wild” ramification), we have
$n(\rho, p) = \dim V - \dim V^{I(\mathfrak{p}/p)}$. In the general case it
is still true, but more tricky to prove, that $n(\rho, p)$ is an integer.


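We can essentially identify one-dimensional Artin representations over
$\mathbf{Q}$ with Dirichlet characters in the following sense. Given a
Dirichlet character $\chi : (\mathbf{Z}/n\mathbf{Z})^* \to \mathbf{C}^*$, we
associate an Artin representation to it by the following diagram:


$$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Gal}(\mathbf{Q}(\mu_n)/\mathbf{Q}) \cong (\mathbf{Z}/n\mathbf{Z})^* \xrightarrow{\ \chi\ } \mathbf{C}^* = \mathrm{GL}_1(\mathbf{C}).$$


Furthermore, the Kronecker-Weber theorem tells us that we obtain all the
one-dimensional Artin representations of $G_{\mathbf{Q}}$ this way.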

Let $\rho : G_{\mathbf{Q}} \to \mathrm{GL}(V)$ be an Artin representation.
We define the characteristic polynomial at $p$ of $\rho$ as follows: we
choose $\mathfrak{p}$ to be a prime ideal over $p$ and set


$$P_p(\rho; T) := \det\left(1 - \rho(\mathrm{Frob}_{\mathfrak{p}})\, T \mid V^{I(\mathfrak{p}/p)}\right). \tag{C.6}$$


The element $\mathrm{Frob}_{\mathfrak{p}}$ is only defined modulo
$I(\mathfrak{p}/p)$, but its action on $V^{I(\mathfrak{p}/p)}$ does not
depend on the chosen representative. Finally, the determinant depends only
on the conjugacy class of $\mathrm{Frob}_{\mathfrak{p}}$, and thus only on
$p$. We know (see, for example, [43] or [63]) that a representation of a
finite group is determined by its character, in other words, by the trace
function $\chi_\rho = \mathrm{Tr} \circ \rho$. By using the elementary
formula


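$$\det(I - TA)^{-1} = \exp\left(\sum_{m=1}^{\infty} \frac{\mathrm{Tr}(A^m)}{m}\, T^m\right),$$


where $A$ is a square matrix, we can write the previous definition as


$$P_p(\rho, T)^{-1} = \exp\left(\sum_{m=1}^{\infty} \frac{\chi_\rho(\mathrm{Frob}_{\mathfrak{p}}^m)}{m}\, T^m\right), \tag{C.7}$$


where, in the ramified case, we restrict the representation to
$V^{I(\mathfrak{p}/p)}$.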
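
The elementary formula $\det(I - TA)^{-1} = \exp\left(\sum_{m \ge 1} \mathrm{Tr}(A^m) T^m/m\right)$ can be checked coefficient by coefficient as an identity of formal power series. The sketch below does this for a $2 \times 2$ integer matrix only, where $\det(I - TA) = 1 - \mathrm{Tr}(A)\,T + \det(A)\,T^2$; the restriction to size 2 and the function names are choices made for the illustration.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def power_traces(a, order):
    """[Tr(A), Tr(A^2), ..., Tr(A^order)]."""
    traces, current = [], a
    for _ in range(order):
        traces.append(current[0][0] + current[1][1])
        current = mat_mul(current, a)
    return traces


def inverse_det_coeffs(a, order):
    """Coefficients of 1 / (1 - Tr(A) T + det(A) T^2) up to T^order."""
    t = a[0][0] + a[1][1]
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    c = [1]
    for k in range(1, order + 1):
        c.append(t * c[k - 1] - (d * c[k - 2] if k >= 2 else 0))
    return c


def exp_trace_coeffs(a, order):
    """Coefficients of exp(sum_m Tr(A^m) T^m / m), via k*e_k = sum_j p_j e_{k-j}."""
    p = power_traces(a, order)
    e = [Fraction(1)]
    for k in range(1, order + 1):
        e.append(sum(p[j - 1] * e[k - j] for j in range(1, k + 1)) / k)
    return e


if __name__ == "__main__":
    a = [[2, 1], [1, 1]]
    print(inverse_det_coeffs(a, 5))
    print(inverse_det_coeffs(a, 8) == exp_trace_coeffs(a, 8))
```

The recurrence used for the exponential comes from differentiating $E = \exp(F)$, which gives $E' = F'E$; with $F = \sum_m \mathrm{Tr}(A^m) T^m/m$ this is the classical Newton identity relating power sums to complete symmetric functions of the eigenvalues.
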


3.3. Definition. Let $\rho : G_{\mathbf{Q}} \to \mathrm{GL}(V)$ be an Artin
representation. Its $L$-function is defined as


$$L(\rho, s) = \prod_p P_p(\rho; p^{-s})^{-1} = \prod_p \det\left(1 - \rho(\mathrm{Frob}_{\mathfrak{p}})\, p^{-s} \mid V^{I(\mathfrak{p}/p)}\right)^{-1}.$$


Since the eigenvalues have absolute value 1, we can easily see that the
Euler product converges absolutely for $\mathrm{Re}(s) > 1$. This
construction generalizes Dirichlet $L$-series, which we recover whenever
the representation is one-dimensional. In fact, a famous theorem of Brauer
on representations of finite groups (see, for example, [63]) shows that
Artin functions $L(\rho, s)$ can be written in the form of a product
$\prod_i L(\chi_i, s)^{m_i}$, where $m_i \in \mathbf{Z}$ (and where the
$\chi_i$ are abelian characters which generalize Dirichlet characters).
Since we know the analytic continuation of the series $L(\chi, s)$ and their
functional equation, we can deduce from this a meromorphic continuation of
$L(\rho, s)$ to the complex plane with a functional equation. Artin
conjectured that in fact $L(\rho, s)$ is everywhere holomorphic, except for
a possible pole at $s = 1$ with order equal to the multiplicity of the
trivial representation in $\rho$.
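
In the one-dimensional case the Euler product is that of a Dirichlet $L$-series, and the agreement between product and series can be observed numerically. The sketch below takes the nontrivial character modulo 4 (an example chosen here, not one singled out by the text) and compares a truncated series with a truncated Euler product at $s = 2$; both truncation bounds are arbitrary, so the two values agree only approximately.

```python
def is_prime(n):
    """Trial division, adequate for the small bounds used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def chi4(n):
    """Nontrivial Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def dirichlet_series(s, terms):
    """Partial sum of sum_n chi4(n) / n**s."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


def euler_product(s, bound):
    """Partial product over primes p <= bound of (1 - chi4(p) p**(-s))**(-1)."""
    value = 1.0
    for p in range(2, bound + 1):
        if is_prime(p):
            value /= 1.0 - chi4(p) * p ** (-s)
    return value


if __name__ == "__main__":
    print(dirichlet_series(2.0, 20_000))
    print(euler_product(2.0, 2_000))
```

At the ramified prime $p = 2$ the character vanishes, so the local factor is 1; this matches the convention of restricting to $V^{I(\mathfrak{p}/p)}$, which is zero there. Both printed values are close to Catalan's constant, approximately $0.91597$.
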


To write the functional equation, we introduce the dimension $n = \dim V$
of the representation and the element
$c \in \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ defined by complex
conjugation, and we denote by $n^+ = \dim V^+$ and $n^- = \dim V^-$, where
$V^+$ (resp. $V^-$) is the subspace with eigenvalue $+1$ (resp. $-1$) for
$\rho(c)$. We then let $\Gamma_{\mathbf{R}}(s) := \pi^{-s/2}\Gamma(s/2)$ and


$$\Lambda(\rho, s) = N_\rho^{s/2}\, \Gamma_{\mathbf{R}}(s)^{n^+}\, \Gamma_{\mathbf{R}}(s+1)^{n^-}\, L(\rho, s).$$

The functional equation is then written as

$$\Lambda(\rho, s) = w_\rho\, \Lambda(\check\rho, 1 - s), \tag{C.8}$$


where $|w_\rho| = 1$ and $\check\rho$ is the dual representation.¹


3.4. Remark. We have introduced Artin representations over $\mathbf{Q}$,
but no difficulties arise when generalizing to continuous representations
$\rho : \mathrm{Gal}(\bar{\mathbf{Q}}/K) \to \mathrm{GL}(V)$.


3.5. Definition. An $\ell$-adic representation is a continuous
representation
$\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(\mathbf{Q}_\ell)$.


We often assume the following added condition: the representation is
unramified outside of a finite set of primes $\mathfrak{p}$ of
$\mathcal{O}_K$. This condition is automatically satisfied in the case of
Artin representations, but not in the case of $\ell$-adic representations.


3.6. Examples. 1) Let $K_{\ell^\infty}$ be the field generated by the
$\ell^n$th roots of unity (for an arbitrary $n$). We can associate to it the
following representation (christened “the cyclotomic character”):


$$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Gal}(K_{\ell^\infty}/\mathbf{Q}) \cong (\mathbf{Z}_\ell)^* = \mathrm{GL}_1(\mathbf{Z}_\ell) \hookrightarrow \mathrm{GL}_1(\mathbf{Q}_\ell).$$

2) Let $E/\mathbf{Q}$ be an elliptic curve defined over $\mathbf{Q}$, and
let

$$E[\ell^n] := \mathrm{Ker}\{[\ell^n] : E(\bar{\mathbf{Q}}) \to E(\bar{\mathbf{Q}})\}.$$


Recall the definition of a Tate module,
$T_\ell(E) := \varprojlim_n E[\ell^n]$. Since
$T_\ell(E) \cong (\mathbf{Z}_\ell)^2$ (5-5.8), and since the Galois group
acts $\mathbf{Z}$-linearly on $E[\ell^n]$, it acts
$\mathbf{Z}_\ell$-linearly on $T_\ell(E)$, and we thus obtain a
representation


$$\rho_{E,\ell} : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}(T_\ell(E)) \cong \mathrm{GL}_2(\mathbf{Z}_\ell) \hookrightarrow \mathrm{GL}_2(\mathbf{Q}_\ell).$$


This representation happens to be unramified outside of $\ell$ and the
places of bad reduction of the elliptic curve (see [70]). Moreover, by
composing with the determinant, we obtain a representation
$\det \circ \rho_{E,\ell} : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_1(\mathbf{Z}_\ell) = \mathbf{Z}_\ell^*$,
which coincides with the cyclotomic character (cf. for example [70]). The
conductor of the representation is defined as in C-3.2. In fact, it can be
shown that the exponent $n(\rho, p)$ is, for $p \ne \ell$, independent of
$\ell$. This allows us to define, in an abstract manner, the conductor of
$E$.


¹ If $\rho : G \to \mathrm{GL}(V)$ and if $V^*$ is the vector space dual to
$V$, the dual representation $\check\rho : G \to \mathrm{GL}(V^*)$ is given
by
$\langle \rho(g)(v), v^* \rangle = \langle v, \check\rho(g^{-1})(v^*) \rangle$.


In general, by making use of the compactness of
$G = \mathrm{Gal}(\bar{K}/K)$, we see that there exists in
$V = \mathbf{Q}_\ell^n$ a lattice $\Lambda \cong \mathbf{Z}_\ell^n$ which is
stable under $\rho(G)$. To do this, we choose a basis $v_1, \dots, v_n$ of
the $\mathbf{Q}_\ell$-vector space, we set
$\Lambda_0 = \mathbf{Z}_\ell v_1 + \cdots + \mathbf{Z}_\ell v_n$ and
$\Lambda := G \cdot \Lambda_0$, and we prove that $(\Lambda : \Lambda_0)$ is
finite. Thus, up to a change of basis in $V$, we can always assume that
$\rho$ has values in $\mathrm{GL}_n(\mathbf{Z}_\ell)$. We can see however
that the image $\rho(G)$ is not finite in general. The $\ell$-adic
representations are in this sense richer than complex representations.


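3.7. Definition. A representation mod $\ell$ is a continuous representation
$\bar\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(\mathbf{F}_\ell)$ (or
more generally $\mathrm{GL}_n(\mathbf{F}_{\ell^m})$).


Such a representation obviously has a finite image. It therefore factors
through the representation of a finite Galois group. One way to obtain such
representations is to reduce an $\ell$-adic representation modulo $\ell$. In
other words, starting with an $\ell$-adic representation
$\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(\mathbf{Q}_\ell)$ which is
normalized so that it has values in $\mathrm{GL}_n(\mathbf{Z}_\ell)$, we can
compose with the reduction homomorphism
$r_\ell : \mathrm{GL}_n(\mathbf{Z}_\ell) \to \mathrm{GL}_n(\mathbf{F}_\ell)$
and thus obtain a representation:


$$\bar\rho := r_\ell \circ \rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(\mathbf{Z}_\ell) \to \mathrm{GL}_n(\mathbf{F}_\ell).$$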


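More generally, if $A$ is a ring and $r : A \to \mathbf{F}_\ell$ a
homomorphism, we say that a representation
$\bar\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(\mathbf{F}_\ell)$ can
be lifted to $A$ if there exists
$\rho : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_n(A)$ such that
$\bar\rho = r \circ \rho$. This can be represented by the diagram

$$\begin{array}{ccc}
 & & \mathrm{GL}_n(A) \\
 & \overset{\rho}{\nearrow} & \downarrow r \\
\mathrm{Gal}(\bar{K}/K) & \xrightarrow{\ \bar\rho\ } & \mathrm{GL}_n(\mathbf{F}_\ell)
\end{array}$$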


In the case where $K = \mathbf{Q}$, $n = 2$ and $\bar\rho$ is irreducible
and odd (i.e., the image of the complex conjugation is of determinant
$-1$), a conjecture of Serre, for the statement of which we refer you to
[66], describes these representations as coming from “modular”
representations. This conjecture has just been proven by Khare and
Wintenberger.